Abstract. This paper provides an introduction to quantum filtering theory. An introduction to
quantum probability theory is given, focusing on the spectral theorem and the conditional expectation 
as a least squares estimate, and culminating in the construction of Wiener and Poisson processes 
on the Fock space. We describe the quantum Ito calculus and its use in the modelling of physical 
systems. We use both reference probability and innovations methods to obtain quantum filtering 
equations for system-probe models from quantum optics. 
1. Introduction. Since even before the industrial revolution feedback control
has played a major role in the development of technology. Nowadays many machines 
and devices that make up our everyday lives use feedback to provide efficient and 
reliable performance despite the ever increasing complexity and miniaturization, and 
a rich control theory has been developed to aid in the design of feedback controllers 
based on device models from classical physics. As microtechnology is making way
for nanotechnology, however, we are now rapidly approaching the boundary of the
classical world past which the effects of quantum mechanics cannot be neglected. 

The laws of quantum mechanics tell us that any description of the phenomena 
at small scales is inherently nondeterministic in nature. This opens new areas of
application for stochastic control theory, which could play an important role in a 
future generation of technology. In particular, as observations of quantum systems 
are inherently noisy, the theory of filtering — the extraction of information from a noisy 
signal — forms an essential part of any quantum feedback control strategy. 

Quantum filtering was already implicit in early work on quantum measurement 
theory by Davies in the 1960s. In its modern form, the study of quantum
filtering and control was pioneered by Belavkin in a series of articles dating back to the 
early 1980s. The theory developed by Belavkin provides an essential
foundation for statistical inference in e.g. quantum optical systems, and much of 
what we will discuss in the second half of this article is based on his work. The theory 
gained popularity in the physics community after it was independently developed on 
a more heuristic level by Carmichael in the early 1990s [20] under the name "quantum
trajectory theory" and has since been widely applied in the description of quantum 
optical experiments and as a computational tool. 

Based on the foundations of quantum filtering theory, methods from classical 
nonlinear and stochastic control can be developed and applied to design feedback 
control laws for quantum systems. These methods may be optimal in some sense, or 
otherwise designed with relevant considerations in mind (e.g. stability). The resulting 
controllers are intended to be implemented with some classical technology (e.g. digital 
or analog electronics). Recent experiments implementing quantum feedback controls
[34] have led to renewed interest in the field, which is now rapidly expanding. We believe
that a fruitful interaction between stochastic control and theoretical and experimental
physics will be essential in paving the way towards the engineering of quantum technologies.

This paper provides an introduction to quantum filtering theory. There are three
key ingredients that are required for the development of the theory. First, we need
to capture both classical probability and quantum mechanics within the framework
of a generalized probability theory, called noncommutative or quantum probability
theory. The central object in this theory, the spectral theorem, provides a link between
quantum systems and the associated probabilistic measurement outcomes. Second, 
we need a noncommutative generalization of the concept of conditional expectations. 
As in classical probability, we will find that a suitably restricted definition of the 
quantum conditional expectation is none other than a least squares estimator, which 
elucidates its role in quantum filtering theory. Finally, we need a noncommutative 
analog of stochastic calculus and quantum stochastic differential equations (QSDE). 
This provides a broad class of models for which we can obtain filtering equations. 

A typical physical scenario, to which the theory that we will develop can be 
applied, is illustrated schematically in the accompanying figure. A cloud of (usually cold, trapped)
atoms interacts with the electromagnetic field in free space; this can be coherent light 
from a laser, or even the vacuum. Depending on their internal state the atoms can, 
for example, emit radiation into the field. If we detect this radiation using an optical 
detection setup we can try to infer some information on the internal state of the 
atoms — this is precisely the goal of quantum filtering theory. If we wanted to control 
the state of the atoms, we could then feed back some function of the state estimates 
through a suitable actuator. Recent laboratory experiments, e.g. [34], implement
precisely such a setup, and provide a motivating example for the theory. 

We begin in §2 by providing some background for quantum filtering. This includes
a discussion of quantum mechanics and quantum probability in the simplest,
finite-dimensional context. In §3 quantum probability is developed in detail. Then in §4 we
show how Wiener and Poisson processes emerge in a particular quantum probabilistic
model based on the Fock space, and how these can be used to develop a noncommutative
stochastic calculus. In §5 we introduce a class of system-observation models
that describe typical experiments in quantum optics. §6 deals with the derivation of
quantum filtering equations using the reference probability approach, while §7 gives
an alternative derivation using the innovations or martingale method.

Scope. It has been our aim to make quantum probability and filtering theory 
accessible, modulo a set of technicalities, to readers with a minimal number of prereq- 
uisites. We (only) presume some familiarity with probability theory and elementary 
functional analysis. We have put an emphasis on introducing the mathematical struc- 
tures of quantum probability theory and on demonstrating their significance and their 
use. As a consequence we do not everywhere achieve the highest level of rigor; we are 
particularly lax in the use of unbounded operators and their domains. It is our hope 
that skimming over these technicalities has enabled us to paint a clearer picture of 
the pillars of the theory and of the essential techniques involved. That being said, we 
should point out that many of the tools described in this paper are applied regularly
and successfully by physicists without paying any attention to the technical issues 
involved; the reader should not hesitate to get his feet wet! 

It is an ambitious project to introduce an unfamiliar probability theory, a new 
stochastic calculus, and to even solve a nontrivial problem (filtering) within the confines
of about 40 pages. Though we have tried to give a pedagogical treatment, the
explanations are sometimes necessarily terse; we hope that the reader will be suf- 
ficiently compelled to work his way through the paper. Needless to say there are 
many omissions; one that particularly deserves mention is the linear case: indeed, the 
quantum Kalman filter, and the corresponding theory of quantum LQG control, can 
be developed along similar lines to the filters we will discuss. We have chosen to omit 
this topic in order to avoid the technicalities of QSDEs with unbounded coefficients, 
but refer instead to [28] and the references therein. 

Notation. The sets of natural, real and complex numbers are denoted $\mathbb{N}$, $\mathbb{R}$
and $\mathbb{C}$, respectively. In general, script symbols (e.g. $\mathscr{N}$) are used for von Neumann
algebras, while calligraphic symbols (e.g. $\mathcal{Y}$) stand for $\sigma$-algebras. $\mathcal{B}$ is the Borel
$\sigma$-algebra on $\mathbb{R}$. Classical probability spaces are denoted as $(\Omega, \mathcal{F}, \mathbf{P})$, and $\mathbf{E}_{\mathbf{P}}$ denotes
the expectation with respect to the measure $\mathbf{P}$. Blackboard symbols (e.g. $\mathbb{P}$) denote
states on von Neumann algebras. Sans-serif symbols (e.g. $\mathsf{H}$) are used for Hilbert
spaces. Hilbert space adjoints, as well as the scalar complex conjugate, are indicated
by $^*$, and the Hilbert space inner product is denoted by $\langle\cdot,\cdot\rangle$. The commutator of two
bounded operators is denoted by $[X, Y] = XY - YX$. $I$ is the identity operator.

2. Background and motivation. In this article we adopt a modern quantum 
probability formulation of quantum mechanics. Quantum probability is the noncom- 
mutative counterpart of Kolmogorov's axiomatic characterization of classical probabil- 
ity theory. In addition to the natural interpretation and mathematical tools provided 
by Kolmogorov's formalism, one of its major successes is that conditioning is a derived 
concept rather than an additional axiom. The situation is much the same in quantum 
probability; in particular, the conditioning axiom or "projection postulate" as it is 
traditionally posed in quantum mechanics can emerge as a consequence of conditional 
expectation and the physical idea that in a single experiment one only has direct 
access to information contained in a commutative subalgebra of observables. 

Considering the success of the classical (Kolmogorov) theory, it should come as 
no surprise that the mathematical abstraction provided by the framework of quantum 
probability pays off significantly (as we will see throughout the article). Introductory
physics textbooks on quantum mechanics rarely use such a description, however. In 
this section we introduce the basic concepts of quantum probability in their simplest 
form, and attempt to provide contact with ideas about quantum mechanics that read- 
ers may be familiar with. This is intended to provide a reference point for interpreting 
the quantum probabilistic framework used in this paper. 

2.1. Some textbook quantum mechanics. According to the textbook by 
Merzbacher [50, page 1], "Quantum mechanics is the theoretical framework within
which it has been found possible to describe, correlate, and predict the behavior of a 
vast range of physical systems, from particles through nuclei, atoms and radiation to 
molecules and condensed matter." Central to quantum mechanics are the notions of 
observables, which are mathematical representations of physical quantities that can 
(in principle) be measured, and states, which summarize the status of physical systems 
and permit the calculation of statistical quantities (such as probabilities, expectations, 
correlations) of observables. 

Indeed, the reader may be familiar with the Schrödinger wavefunction $\psi(q,t)$ for
a particle of mass $m$ moving in a potential $V(q)$ (dependent on position $q$, in one
dimension for simplicity). If $Q$ is the observable representing position (defined below
in Example 3.9), the expected position of the particle when in a state described by
$\psi(q,t)$ at time $t$ is defined to be

$$\langle Q \rangle = \int q\,|\psi(q,t)|^2\,dq. \tag{2.1}$$

The wavefunctions are normalized to one, $\int |\psi(q,t)|^2\,dq = 1$, so that $|\psi(q,t)|^2$ can be
interpreted as the probability density of the position of the particle. The dynamics of
the particle are described by the famous Schrödinger equation

$$i\hbar\,\frac{\partial \psi(q,t)}{\partial t} = -\frac{\hbar^2}{2m}\,\frac{\partial^2 \psi(q,t)}{\partial q^2} + V(q)\,\psi(q,t), \tag{2.2}$$

where $\hbar = h/2\pi$, $h$ is Planck's constant, and $i^2 = -1$.

The key distinction between classical (i.e. non-quantum) and quantum mechanics 
is that quantum mechanics is noncommutative, meaning that there exist observables 
that do not commute, a fact which has deep implications. The momentum observable 
$P$ (defined below in Example 3.9) does not commute with the position observable $Q$;
in fact $[Q, P] = QP - PQ = i\hbar I$. The most famous implication of this failure of
commutativity is Heisenberg's uncertainty relation, which asserts that

$$\Delta Q\,\Delta P \ge \tfrac{1}{2}\,\bigl|\langle i[Q,P] \rangle\bigr| = \frac{\hbar}{2}, \tag{2.3}$$

where the standard deviations are defined by $\Delta Q = (\langle Q^2 \rangle - \langle Q \rangle^2)^{1/2}$, $\Delta P = (\langle P^2 \rangle - \langle P \rangle^2)^{1/2}$.
Naive interpretation of the Heisenberg uncertainty relation can be misleading; we 
will discuss its precise meaning in the following section. Nonetheless, it evidently 
implies that there is a fundamental irreducible randomness in quantum mechanics. 
This is in contrast to classical randomness, which in principle can be eliminated with 
enough effort and information. Experimental evidence has repeatedly confirmed the 
irreducible randomness of quantum mechanical observations. 

Let us make this somewhat vague discussion a little more precise. For simplicity, 
we will work in this section only in a finite-dimensional setting (in which observations 
can only take a finite number of values, i.e. they are finite-state random variables). 
First, recall that if $A = A^*$ is a self-adjoint operator on a finite-dimensional Hilbert
space $\mathsf{H} = \mathbb{C}^n$, it has at most $n$ (distinct) real eigenvalues. The set $\mathrm{spec}(A) = \{a_j\}$
of eigenvalues of $A$ is called the spectrum of $A$, and $A$ can be written as

$$A = \sum_{a \in \mathrm{spec}(A)} a\,P_a, \tag{2.4}$$

where $P_a$ is the projection operator onto the subspace of $\mathsf{H}$ spanned by the eigenvectors with
eigenvalue $a$. The projections resolve the identity: $\sum_{a \in \mathrm{spec}(A)} P_a = I$.
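
As a hedged illustration of (2.4), the following sketch computes the spectral decomposition of a self-adjoint $2 \times 2$ matrix in plain Python. The function name `spectral_decomposition_2x2` and the test matrix are illustrative choices that do not come from the text, and the shortcut $P_a = (A - bI)/(a - b)$ (with $b$ the other eigenvalue) is specific to two dimensions with distinct eigenvalues.

```python
import math

def spectral_decomposition_2x2(A):
    """Return {eigenvalue: projection} for a self-adjoint 2x2 matrix A."""
    (a, b), (_, d) = A
    mean = (a + d).real / 2
    radius = math.sqrt(((a - d).real / 2) ** 2 + abs(b) ** 2)
    if radius == 0:
        # A is a multiple of the identity: a single eigenvalue, P = I.
        return {mean: [[1, 0], [0, 1]]}
    upper, lower = mean + radius, mean - radius

    def projection(eig, other):
        # In two dimensions, P_eig = (A - other * I) / (eig - other).
        return [[(A[i][j] - (other if i == j else 0)) / (eig - other)
                 for j in range(2)] for i in range(2)]

    return {upper: projection(upper, lower), lower: projection(lower, upper)}

# Illustrative check with the matrix [[0, 1], [1, 0]].
A = [[0, 1], [1, 0]]
decomposition = spectral_decomposition_2x2(A)

# Reassemble A = sum_a a * P_a and the resolution of the identity sum_a P_a = I.
rebuilt = [[sum(a * P[i][j] for a, P in decomposition.items()) for j in range(2)]
           for i in range(2)]
identity = [[sum(P[i][j] for P in decomposition.values()) for j in range(2)]
            for i in range(2)]
```

Here `rebuilt` reproduces `A` and `identity` is the $2 \times 2$ identity matrix, which are the two properties stated above.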

In this finite-dimensional setting, the following operational characterization of 
quantum mechanical models (often referred to as the "postulates" of quantum me- 
chanics) can be found in most introductory textbooks. 

Observables. Physical quantities like position, momentum, spin, etc., are rep- 
resented by self-adjoint operators on the Hilbert space H and are called observables. 
These are the noncommutative counterparts of random variables. 

States. A state is meant to provide a summary of the status of a physical system 
that enables the calculation of statistical quantities associated with observables. A 
generic state is specified by a density matrix $\rho$, which is a self-adjoint operator on
$\mathsf{H}$ that is positive, $\rho \ge 0$, and normalized, $\mathrm{Tr}[\rho] = 1$. This is the noncommutative
counterpart of a probability density.

Measurement. A measurement is a physical procedure or experiment that produces
numerical results related to observables. In any given measurement, the allowable
results take values in the spectrum $\mathrm{spec}(A)$ of a chosen observable $A$. Given the
state $\rho$, the value $a \in \mathrm{spec}(A)$ is observed with probability $\mathrm{Tr}[\rho P_a]$. Consequently
the expectation of an observable $A$ is given by $\langle A \rangle = \mathrm{Tr}[\rho A]$.

Conditioning. Suppose that a measurement of $A$ gives rise to the observation
$a \in \mathrm{spec}(A)$. Then we must condition the state in order to predict the outcomes of
subsequent measurements, by updating the density matrix $\rho$ using

$$\rho \mapsto \frac{P_a\,\rho\,P_a}{\mathrm{Tr}[\rho P_a]}. \tag{2.5}$$

This is known as the "projection postulate".

Evolution. A closed (i.e. isolated) quantum system evolves in a unitary fashion:
a physical quantity that is described at time $t = 0$ by an observable $A$ is described at
time $t > 0$ by $A(t) = U(t)^* A\,U(t)$, where $U(t)$ is a unitary operator for each time $t$.
The unitary is generated by the Schrödinger equation

$$i\hbar\,\frac{d}{dt}U(t) = H(t)\,U(t), \tag{2.6}$$

where the (time dependent) Hamiltonian $H(t)$ is a self-adjoint operator for each $t$.
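
The measurement and conditioning postulates above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: we take the projection postulate in its standard form, $\rho \mapsto P_a \rho P_a / \mathrm{Tr}[\rho P_a]$, and the helper names (`outcome_probability`, `condition`) as well as the numerical state are hypothetical choices of ours.

```python
def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def outcome_probability(rho, P):
    """Probability Tr[rho P] of the event described by the projection P."""
    return trace(matmul(rho, P)).real

def condition(rho, P):
    """Conditioned state P rho P / Tr[rho P] after the event P is observed."""
    prob = outcome_probability(rho, P)
    if prob == 0:
        raise ValueError("cannot condition on an event of probability zero")
    PrhoP = matmul(matmul(P, rho), P)
    return [[entry / prob for entry in row] for row in PrhoP]

# Illustrative data: a two-level system with eigenprojections P_up, P_down.
rho = [[0.5, 0.5], [0.5, 0.5]]
P_up = [[1, 0], [0, 0]]
P_down = [[0, 0], [0, 1]]

p_up = outcome_probability(rho, P_up)
p_down = outcome_probability(rho, P_down)
rho_up = condition(rho, P_up)
```

After conditioning on `P_up`, a repeated measurement of the same observable returns the same outcome with probability one, since `outcome_probability(rho_up, P_up)` equals 1.
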
Before continuing, we make the following remarks. 

Remark 2.1. (Pure states). The set of density matrices $\rho$ is convex; we can thus
ask what the extremal points of this set are, i.e. those that correspond to the most
informative states. It is not difficult to show that the set of extremal density matrices
is the set of projections onto one-dimensional subspaces. Thus we can specify any
extremal state uniquely (up to a phase factor $e^{i\varphi}$) by a single unit vector $\psi \in \mathsf{H}$ in
the corresponding subspace, and $\mathrm{Tr}[\rho X] = \langle \psi, X\psi \rangle$ for any operator $X$. In classical
probability theory, the set of probability measures is also convex and the extremal
measures are deterministic (Dirac) measures. In the quantum mechanical setting, on
the other hand, the Heisenberg uncertainty relation implies that even extremal states
do not give deterministic measurement outcomes for all observables.

Historically, and in most textbooks, quantum mechanics is first formulated in 
terms of the extremal states (called pure states) and the description is later generalized 
to density matrices (mixed states). The Schrödinger wavefunction $\psi(q,t)$ is an example
of a pure state vector in an infinite-dimensional Hilbert space setting. □

Remark 2.2. (Heisenberg vs. Schrödinger picture). In the above description of
time evolution we work with a fixed state while the observables change in time. This
conforms to the usual treatment in classical probability theory, where the underlying
probability measure is fixed at the outset and the random variables are time dependent
(stochastic processes). In quantum mechanics this is known as the Heisenberg picture;
equally (or perhaps more) popular is the Schrödinger picture, in which the observables
are considered fixed and the density matrix evolves as $\rho(t) = U(t)\,\rho\,U(t)^*$. The two
pictures are essentially equivalent as $\mathrm{Tr}[\rho A(t)] = \mathrm{Tr}[\rho(t) A]$ for any observable $A$.

Note that if we start in a pure state, then unitary evolution preserves this property;
in terms of the state vector, $\psi(t) = U(t)\psi$. Intuitively, this enforces the physical
idea that no information is lost from an isolated system. Together with (2.6) we obtain
the traditional Schrödinger equation for $\psi(t)$, of which (2.2) is a special case (for
a specific choice of $H$, in infinite dimensions). We will always work in the Heisenberg
picture, however, as we will be dealing with (quantum) stochastic processes. □
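
The equivalence of the two pictures noted in Remark 2.2 is easy to check numerically. The sketch below does so for one fixed unitary; the rotation angle, the density matrix and the choice of observable are arbitrary illustrative values, not taken from the text.

```python
import math

def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def dagger(X):
    """Adjoint (conjugate transpose) of a square matrix."""
    n = len(X)
    return [[complex(X[j][i]).conjugate() for j in range(n)] for i in range(n)]

theta = 0.7                                   # arbitrary angle (illustrative)
c, s = math.cos(theta / 2), math.sin(theta / 2)
U = [[c, -1j * s], [-1j * s, c]]              # a 2x2 unitary matrix

rho = [[0.75, 0.25], [0.25, 0.25]]            # self-adjoint, positive, trace one
A = [[1, 0], [0, -1]]                         # a self-adjoint observable

A_t = matmul(matmul(dagger(U), A), U)         # Heisenberg: A(t) = U* A U
rho_t = matmul(matmul(U, rho), dagger(U))     # Schrodinger: rho(t) = U rho U*

heisenberg = trace(matmul(rho, A_t))          # Tr[rho A(t)]
schrodinger = trace(matmul(rho_t, A))         # Tr[rho(t) A]
```

The two numbers agree up to rounding, as they must by cyclicity of the trace.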

As a basic illustration we discuss the following simple example. 

Example 2.3. One of the classic experimental demonstrations of the necessity 
of quantum mechanics was performed in 1922 by Stern and Gerlach. A silver atom 
is subjected to an inhomogeneous magnetic field. The atom possesses an intrinsic 
magnetic moment, and hence experiences a force that is proportional to the component 
of its magnetic moment in the direction of the field gradient. As Stern and Gerlach
did not prepare the atom in a particular orientation, they expected it to be deflected 
randomly in a continuous range of directions corresponding to a random orientation 
of the magnetic moment. Repeated runs of the experiment showed, however, that the 
atom is randomly deflected into two discrete directions only — the reason being that 
in quantum mechanics the magnetic moment (or spin) observable is discrete, rather 
than continuous. Atoms deflected in the upper direction are said to have "spin up", 
while those in the lower direction have "spin down".

A simple model of a spin is as follows. Let $\mathsf{H} = \mathbb{C}^2$, and consider the observable

$$\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{2.7}$$

representing spin in the $z$ direction. We have $\mathrm{spec}(\sigma_z) = \{-1, 1\}$, which correspond
to spin down and spin up, respectively. In terms of the eigenprojections

$$P_{z,1} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad P_{z,-1} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix},$$

we can write $\sigma_z = P_{z,1} - P_{z,-1}$. The next step is to introduce a state. Consider a pure
state, given by the vector $\psi = (c_1\ c_{-1})^T$ with $|c_1|^2 + |c_{-1}|^2 = 1$. If we observe $\sigma_z$, we
obtain the outcome $1$ (spin up) with probability $\langle \psi, P_{z,1}\psi \rangle = |c_1|^2$, or the outcome
$-1$ with probability $\langle \psi, P_{z,-1}\psi \rangle = |c_{-1}|^2$. □

2.2. A first look at quantum probability. The description of quantum mechanics
in the previous section contains the rudiments of a viable probability theory.
We will now formalize these ideas, once again restricting ourselves to the
finite-dimensional case for simplicity (the general theory, which will be discussed in §3, is
conceptually very similar). Two key ideas, which we elaborate on below, form the
essence of the formalism: the first is that a set of measurements made in a single
realization¹ of a quantum experiment corresponds to a particular choice of a commutative
algebra of observables; and the second is that any such commutative algebra is
entirely equivalent to a classical (Kolmogorov) probability model.

A classical probability model is described by a probability space $(\Omega, \mathcal{F}, \mathbf{P})$. Here
$\Omega$, the sample space, is not of essential importance; the basic ingredients of the theory
are the events that can occur, contained in the $\sigma$-algebra $\mathcal{F}$, and their probabilities,
which are determined by the measure $\mathbf{P}$. Equivalently, we could describe an event
$F \in \mathcal{F}$ by a random variable $\chi_F$ which takes the value 1 if $F$ occurs and 0 otherwise (the
indicator function on $F$), and the probability of the event is simply the expectation
of $\chi_F$. We have already encountered such objects in the previous section: events are
precisely those observables that are projection operators ($P = P^* = P^2$), and the
probability of an event $P$ is given by $\mathbb{P}(P) = \mathrm{Tr}[\rho P]$. Thus the set of projections,
together with the linear map $\mathbb{P}$, play much the same role as the classical pair $\mathcal{F}$, $\mathbf{P}$.

¹ By a realization or an experiment we mean that random variables are assigned a definite value,
as is the case if we perform measurements on a single physical system. In classical probability this
corresponds to the choice of a sample point $\omega \in \Omega$; the quantum case is a little more subtle.

We run into trouble in the quantum case when we try to ascribe joint probabilities 
to certain events. This is always possible in classical probability theory: the joint 
probability of the events $A$ and $B$ is $\mathbf{P}(A \cap B) = \mathbf{E}_{\mathbf{P}}(\chi_A \chi_B)$. But given two projection
operators $P$, $Q$ the operator $PQ$ is not guaranteed to be a projection or even an
observable ($(PQ)^* = QP$), unless $P$ and $Q$ commute. This simple observation is no
coincidence; it has the following physical interpretation: in a single realization of a
quantum probability model, we can only verify the truth of a set of commuting events.
This is in contrast with classical probability where in every realization any event is
either true or false, whether we choose to observe it or not. In quantum probability
we can a priori choose to verify the truth of an arbitrary event, but subsequently some 
of the other events (those that do not commute with the observed event, said to be 
incompatible) become meaningless within the same realization. 

The incompatibility of events is a significant conceptual departure from classical 
probability, and requires a little getting used to. In many ways, however, this is the 
only essential departure from classical probability theory. We now begin to construct
the mathematical formalism of quantum probability, and we will show that it is indeed 
very close to Kolmogorov's theory. 

Consider the following idea. Suppose we decide to measure an observable $A$ and
obtain a particular outcome $a \in \mathrm{spec}(A)$. Then we do not need to perform another
measurement to know that any function $f(A)$ would give the outcome $f(a)$; in essence,
this is merely a relabeling of the measurement outcomes of $A$. Indeed,

$$A = \sum_{a \in \mathrm{spec}(A)} a\,P_a \quad\Longrightarrow\quad f(A) = \sum_{a \in \mathrm{spec}(A)} f(a)\,P_a, \tag{2.8}$$

and all such operators commute with each other. Thus measuring $A$ "automatically"
measures all functions $f(A)$. The set of operators $\mathscr{A} = \{X : X = f(A),\ f : \mathbb{R} \to \mathbb{C}\}$
forms a commutative $*$-algebra, i.e. arbitrary (complex) linear combinations, products
and adjoints of operators in $\mathscr{A}$ are still in $\mathscr{A}$, $I \in \mathscr{A}$, and all elements of $\mathscr{A}$ commute.
We will call $\mathscr{A}$ the $*$-algebra generated by $A$.² A linear map $\mathbb{P} : \mathscr{A} \to \mathbb{C}$ that is
positive ($\mathbb{P}(A) \ge 0$ if $A \ge 0$) and normalized ($\mathbb{P}(I) = 1$) is called a state on $\mathscr{A}$
(clearly we can always write such a state as $\mathbb{P}(A) = \mathrm{Tr}[\rho A]$ for some density matrix $\rho$).
Note that the projections $P \in \mathscr{A}$ are precisely those events that we can distinguish
by measuring $A$, and $\mathbb{P}(P)$ gives their probabilities. We can similarly generate the
commutative $*$-algebra of functions of an arbitrary set of commuting observables.

The algebraic structure we have introduced is of fundamental importance as it 
provides us with a direct connection to the classical theory, as follows: 

Theorem 2.4 (Spectral theorem, finite-dimensional case). Let $\mathscr{A}$ be a commutative
$*$-algebra of operators on a finite-dimensional Hilbert space, and let $\mathbb{P}$ be a state
on $\mathscr{A}$. Then there is a probability space $(\Omega, \mathcal{F}, \mathbf{P})$ and a map $\iota$ from $\mathscr{A}$ onto the
set of measurable functions on $\Omega$ that is a $*$-isomorphism, i.e. a linear bijection with
$\iota(AB) = \iota(A)\iota(B)$ (pointwise) and $\iota(A^*) = \iota(A)^*$, and moreover $\mathbb{P}(A) = \mathbf{E}_{\mathbf{P}}(\iota(A))$.

Proof. The proof is an elementary exercise in linear algebra. As the Hilbert space
$\mathsf{H}$ has dimension $n < \infty$, we can without loss of generality suppose that $\mathsf{H} = \mathbb{C}^n$ and
that $\mathscr{A}$ is a commutative $*$-algebra of complex $n \times n$ matrices. As all the elements of
$\mathscr{A}$ commute, we can find a unitary matrix $U$ such that $U^* A U$ is a diagonal matrix for
every $A \in \mathscr{A}$. Let $\Omega = \{1, \ldots, n\}$. Define $\iota(A) : \Omega \to \mathbb{C}$ by $\iota(A)(i) = (U^* A U)_{ii}$ for
every $A \in \mathscr{A}$. Next, define $\mathcal{F} = \sigma\{\iota(A) : A \in \mathscr{A}\}$. Finally, define $\mathbf{P}(S) = \mathbb{P}(\iota^{-1}(\chi_S))$
for every $S \in \mathcal{F}$. We have now explicitly constructed $(\Omega, \mathcal{F}, \mathbf{P})$ and $\iota$. □

² In fact, it is the smallest $*$-algebra of operators that contains $A$.

Evidently the commutative $*$-algebra structure is completely equivalent to classical
probability theory; by simultaneously diagonalizing all the operators in the algebra,
we obtain an explicit representation of measurable random variables as the
functions on the diagonals. We also note the following. Suppose we are given some
(large) commutative $*$-algebra $\mathscr{A}$ and consider a subalgebra $\mathscr{B} \subset \mathscr{A}$ generated by a
single element $B \in \mathscr{A}$. If we apply the map $\iota$ to $\mathscr{B}$, we obtain precisely the subset of
functions on $\Omega$ that are measurable with respect to $\sigma\{\iota(B)\}$. Thus subalgebras play
the same role in quantum probability as sub-$\sigma$-algebras in classical probability; they
allow us to keep track of particular subsets of information.
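
The construction in the proof of the spectral theorem can be traced by hand in two dimensions. The sketch below is an illustration under our own assumptions: the commutative algebra is the one generated by the matrix $[[0, 1], [1, 0]]$, the state is the pure state of an arbitrarily chosen unit vector `psi`, sample points are indexed from 0 rather than 1, and the names `iota`, `iota_inverse`, `state` and `measure` are hypothetical.

```python
import math

def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(X):
    """Adjoint (conjugate transpose) of a square matrix."""
    n = len(X)
    return [[complex(X[j][i]).conjugate() for j in range(n)] for i in range(n)]

r = 1 / math.sqrt(2)
U = [[r, r], [r, -r]]          # unitary whose columns diagonalize the generator
generator = [[0, 1], [1, 0]]   # self-adjoint generator of the commutative algebra
psi = [0.6, 0.8]               # illustrative unit vector defining the state

def state(A):
    """Pure state A -> <psi, A psi>."""
    Apsi = [sum(A[i][j] * psi[j] for j in range(2)) for i in range(2)]
    return sum(complex(psi[i]).conjugate() * Apsi[i] for i in range(2))

def iota(A):
    """Random variable on Omega = {0, 1}: the diagonal of U* A U."""
    D = matmul(matmul(dagger(U), A), U)
    return [D[i][i] for i in range(2)]

def iota_inverse(f):
    """Operator U diag(f) U*, whose image under iota is the function f."""
    D = [[f[0], 0], [0, f[1]]]
    return matmul(matmul(U, D), dagger(U))

# Classical measure: P({i}) = state(iota^{-1}(indicator function of {i})).
measure = [state(iota_inverse([1, 0])).real, state(iota_inverse([0, 1])).real]

def classical_expectation(f):
    """Expectation of the random variable f under the constructed measure."""
    return sum(f[i] * measure[i] for i in range(2))
```

With these definitions, `classical_expectation(iota(generator))` agrees with `state(generator)` up to rounding, which is the last assertion of the theorem for this particular algebra and state.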

We do not a priori have a basis for specifying a particular commutative $*$-algebra;
given a quantum system, we could decide to measure any of a large set of incompatible
observables. The discussion up to this point motivates the following definition.

Definition 2.5 (Quantum probability space, finite-dimensional case). A pair
$(\mathscr{N}, \mathbb{P})$, where $\mathscr{N}$ is a $*$-algebra of operators on a finite-dimensional Hilbert space
and $\mathbb{P}$ is a state on $\mathscr{N}$, is called a (finite-dimensional) quantum probability space.

Usually we will choose $\mathscr{N}$ to be the set of all (bounded) operators $\mathscr{B}(\mathsf{H})$ on
some underlying Hilbert space $\mathsf{H}$. The principles of quantum probability now boil
down to the following. In each realization, we must make a choice of commutative
$*$-subalgebra $\mathscr{A} \subset \mathscr{N}$ which fixes the observations. Every statistic that pertains to
these observations (e.g., the statistics compiled by repeating the experiment many
times with the same choice of $\mathscr{A}$) is now described by the classical probability model
obtained through the spectral theorem. The reader should convince himself that the
operational description given in the previous section fits neatly within this model
(with the exception of conditioning, which we discuss in §2.4).

Notice that in contrast to a classical probability space $(\Omega, \mathcal{F}, \mathbf{P})$, there are no
sample points $\omega \in \Omega$ in a quantum probability space. The sample points emerge
through the spectral theorem after the choice of a commutative $*$-subalgebra.

Example 2.6. Let us reformulate Example 2.3. Set $\mathsf{H} = \mathbb{C}^2$ and choose
$\mathscr{N} = \mathscr{B}(\mathsf{H}) = M_2$, the $*$-algebra of $2 \times 2$ complex matrices. The pure state is defined
by $\mathbb{P}(A) = \langle \psi, A\psi \rangle = \psi^* A \psi$ (recall that $\psi = (c_1\ c_{-1})^T$ with $|c_1|^2 + |c_{-1}|^2 = 1$).

The observable $\sigma_z$, used to represent spin measurement in the $z$ direction, generates
a commutative $*$-subalgebra $\mathscr{A}_z \subset \mathscr{N}$. It is not difficult to see that $\mathscr{A}_z$ is simply
the linear span of the events $P_{z,1}$ and $P_{z,-1}$. Let us now apply the spectral theorem;
we obtain the probability space $(\Omega, \mathcal{F}, \mathbf{P})$ where $\Omega = \{1, 2\}$, $\mathcal{F} = \{\emptyset, \{1\}, \{2\}, \Omega\}$,
$\mathbf{P}(\{1\}) = |c_1|^2$, etc., and $\iota(P_{z,1}) = \chi_{\{1\}}$, $\iota(P_{z,-1}) = \chi_{\{2\}}$. In particular, the random
variable $\iota(\sigma_z) : (1, 2) \mapsto (1, -1)$ has precisely the right properties.

Now suppose we do not wish to measure the intrinsic angular momentum (spin) 
in the z-direction, but in the x-direction. This corresponds to the observable 

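$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \tag{2.9}$$

which has the spectral decomposition $\sigma_x = P_{x,1} - P_{x,-1}$ with

$$P_{x,1} = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \qquad P_{x,-1} = \frac{1}{2}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}.$$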

The observable $\sigma_x$ also generates a commutative $*$-subalgebra $\mathscr{A}_x = \mathrm{span}\{P_{x,1}, P_{x,-1}\}$
to which we can apply the spectral theorem. However, as $\sigma_x$ and $\sigma_z$ do not commute,
they cannot be jointly represented on a classical probability space through the spectral
theorem. In other words, $\sigma_x$ and $\sigma_z$ are incompatible and their joint statistics are
undefined; hence they cannot both be observed in the same realization. □
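
The incompatibility in this example can be verified directly. The following sketch, with illustrative variable names of our own choosing, computes the commutator of the two spin observables and shows that the product of a "spin up along x" event and a "spin up along z" event is neither self-adjoint nor idempotent, so it is not an event.

```python
def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(X):
    """Adjoint (conjugate transpose) of a square matrix."""
    n = len(X)
    return [[X[j][i].conjugate() for j in range(n)] for i in range(n)]

def commutator(X, Y):
    """[X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]

sigma_z = [[1, 0], [0, -1]]
sigma_x = [[0, 1], [1, 0]]
P_z_up = [[1, 0], [0, 0]]             # eigenprojection of sigma_z, eigenvalue 1
P_x_up = [[0.5, 0.5], [0.5, 0.5]]     # eigenprojection of sigma_x, eigenvalue 1

comm = commutator(sigma_x, sigma_z)   # nonzero: the observables do not commute

PQ = matmul(P_x_up, P_z_up)
is_self_adjoint = PQ == dagger(PQ)    # False: PQ is not an observable
is_idempotent = matmul(PQ, PQ) == PQ  # False: PQ is not a projection
```
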
To conclude this section, let us say a few words about the interpretation of the
Heisenberg uncertainty relation. The relation says that the product of the standard deviations
of two noncommuting observables is bounded from below by a positive constant. It
is important to realize, however, that the two observables cannot be measured in
the same realization as they are incompatible; in particular, the covariance of the
observables is undefined. Rather, the uncertainty relation is a statement about the
properties of quantum states: for any state, the statistics of the two observables, compiled
in the course of separate realizations in each of which only one of the observables
is measured, must obey the Heisenberg inequality.³
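
To make this reading concrete, the sketch below evaluates both sides of the general form of the inequality, $\Delta A\,\Delta B \ge \tfrac{1}{2}|\langle i[A,B] \rangle|$, for the two spin observables in a single pure state. Each spread is computed from the statistics of one observable alone; no joint distribution is ever used. The state vector and all function names are illustrative assumptions of ours.

```python
import cmath
import math

def apply(A, v):
    """Matrix-vector product A v."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def inner(u, v):
    """Inner product <u, v>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def spread(A, psi):
    """Standard deviation of the self-adjoint observable A in the state psi."""
    mean = inner(psi, apply(A, psi)).real
    second_moment = inner(apply(A, psi), apply(A, psi)).real  # <psi, A^2 psi>
    return math.sqrt(max(second_moment - mean ** 2, 0.0))

def commutator_bound(A, B, psi):
    """Lower bound |<i[A, B]>| / 2 appearing in the uncertainty relation."""
    AB = inner(psi, apply(A, apply(B, psi)))
    BA = inner(psi, apply(B, apply(A, psi)))
    return abs(1j * (AB - BA)) / 2

sigma_x = [[0, 1], [1, 0]]
sigma_z = [[1, 0], [0, -1]]

# An arbitrary unit vector (illustrative values).
psi = [math.cos(0.4), cmath.exp(0.9j) * math.sin(0.4)]

lhs = spread(sigma_x, psi) * spread(sigma_z, psi)
rhs = commutator_bound(sigma_x, sigma_z, psi)
```

For this state `lhs` is at least `rhs`, as the inequality requires. Note that for spin observables the bound depends on the state, whereas for position and momentum it is the constant $\hbar/2$.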

2.3. Composite systems. We will often wish to form a composite probability
model from two separate probability spaces. In classical probability theory, two probability
spaces $(\Omega_1, \mathcal{F}_1, \mathbf{P}_1)$ and $(\Omega_2, \mathcal{F}_2, \mathbf{P}_2)$ can be merged into a single probability
space $(\Omega_1 \times \Omega_2, \mathcal{F}_1 \times \mathcal{F}_2, \mathbf{P}_1 \times \mathbf{P}_2)$ where $\mathbf{P}_1 \times \mathbf{P}_2$ is the product measure. We now
briefly describe the noncommutative counterpart.

Consider a composite system constructed from two quantum probability spaces
$(\mathscr{N}_1, \mathbb{P}_1)$, $(\mathscr{N}_2, \mathbb{P}_2)$ of operators on the Hilbert spaces $\mathsf{H}_1$ and $\mathsf{H}_2$, respectively. The
composite quantum probability space consists of operators on the tensor product
Hilbert space $\mathsf{H}_1 \otimes \mathsf{H}_2$; for vectors $\psi_1, \phi_1 \in \mathsf{H}_1$ and $\psi_2, \phi_2 \in \mathsf{H}_2$, the inner product on
$\mathsf{H}_1 \otimes \mathsf{H}_2$ is given by

$$\langle \psi_1 \otimes \psi_2, \phi_1 \otimes \phi_2 \rangle = \langle \psi_1, \phi_1 \rangle\,\langle \psi_2, \phi_2 \rangle,$$

which is extended by linearity to any vector in the tensor product space. The algebra
$\mathscr{N}_1 \otimes \mathscr{N}_2$ is generated by elements of the form $A_1 \otimes A_2$, which act as

$$(A_1 \otimes A_2)(\psi_1 \otimes \psi_2) = A_1\psi_1 \otimes A_2\psi_2,$$

where $A_1 \in \mathscr{N}_1$ and $A_2 \in \mathscr{N}_2$. Finally, the product state is defined by

$$(\mathbb{P}_1 \otimes \mathbb{P}_2)(A_1 \otimes A_2) = \mathbb{P}_1(A_1)\,\mathbb{P}_2(A_2),$$

and is extended by linearity. The quantum probability space $(\mathscr{N}_1 \otimes \mathscr{N}_2, \mathbb{P}_1 \otimes \mathbb{P}_2)$
of operators on the Hilbert space $\mathsf{H}_1 \otimes \mathsf{H}_2$ describes the composite system. The
reader should verify that if $\mathscr{N}_1$ and $\mathscr{N}_2$ are commutative, then applying the spectral
theorem to the composite system is equivalent to applying the spectral theorem to
the individual subsystems, then forming the composite classical probability space.
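
In finite dimensions the tensor product of operators is represented by the Kronecker product of matrices, which gives a quick numerical check of the product-state formula. The sketch below assumes that each state is given by a density matrix, so that the product state corresponds to the Kronecker product of the two density matrices; the matrices and the helper name `kron` are illustrative choices of ours.

```python
def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def kron(X, Y):
    """Kronecker product: the matrix of X (tensor) Y in the product basis."""
    m = len(Y)
    size = len(X) * m
    return [[X[i // m][j // m] * Y[i % m][j % m] for j in range(size)]
            for i in range(size)]

rho_1 = [[0.75, 0.25], [0.25, 0.25]]   # density matrix of the first system
rho_2 = [[0.5, 0.5], [0.5, 0.5]]       # density matrix of the second system
A_1 = [[1, 0], [0, -1]]                # observable of the first system
A_2 = [[0, 1], [1, 0]]                 # observable of the second system

# Expectation of A_1 (tensor) A_2 in the product state ...
joint = trace(matmul(kron(rho_1, rho_2), kron(A_1, A_2)))
# ... and the product of the two separate expectations.
product = trace(matmul(rho_1, A_1)) * trace(matmul(rho_2, A_2))
```

The values `joint` and `product` coincide, in line with the defining property of the product state on elements of the form $A_1 \otimes A_2$.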

2.4. Conditional expectations. Let us recall for a moment the Stern-Gerlach
experiment of Examples 2.3 and 2.6. We have introduced the observables $\sigma_z$ and $\sigma_x$,
corresponding to spin in the $z$ and $x$ directions. These observables are incompatible,
so we cannot measure them in the same realization. Recall that in order to measure
$\sigma_z$, Stern and Gerlach apply a field gradient in the $z$ direction; the atom then acquires
momentum in that direction proportional to $\sigma_z$, and we can determine the value of $\sigma_z$
in that realization by observing whether the atom is deflected up ($1$) or down ($-1$).
Similarly, $\sigma_x$ is measured by orienting the field gradient along the $x$ axis.

³ In the physics literature one often finds statements to the effect that the Heisenberg uncertainty
relation limits the precision with which we can "imperfectly" observe two noncommuting observables
simultaneously, i.e. within the same realization. This is a misconception. Though the idea of an
imperfect measurement can be implemented rigorously (e.g. [37]), this gives rise to an uncertainty
relation which is different from Heisenberg's uncertainty relation.

We wouldn't be measuring both $\sigma_z$ and $\sigma_x$ by applying both field gradients
simultaneously: rather, as magnetic fields add vectorially, this would measure the spin
in some other direction in the x-z plane whose observable commutes with neither $\sigma_z$
nor $\sigma_x$. On the other hand, we could first apply the field gradient in the z direction
until we can resolve $\sigma_z$, then turn this field off and switch on a field in the x direction
to resolve $\sigma_x$. It is a characteristic feature of quantum mechanics that the
measurement outcomes in such a procedure can differ drastically depending on the order
in which we apply the fields. It is thus of crucial importance to specify precisely how such
measurements are performed by including in the quantum probability space a model
of the measurement apparatus (or probe).

We defer the discussion of the Stern-Gerlach measurement with magnetic fields
until we have developed the necessary machinery. For the sake of example, we
develop in this section a simpler probe model which shows the main features of the
procedure. We will see that this probe model, together with the concept of conditional
expectations, reproduces precisely the traditional projection postulate of §2.1.

Let us begin by discussing conditional expectations in the noncommutative con- 
text. The key observation we need is the following. The conditional probability of 
an event B given an event A is the probability that B is true given that A is true 
in the same realization. Hence the concept of conditioning inherently makes sense 
only in the context of quantities that can be observed in the same realization of an 
experiment. This means that we can only define conditional expectations in commu- 
tative subalgebras of a quantum probability space; but as long as we are restricted 
to the commutative case, the spectral theorem allows us to define any probabilistic 
operation directly in terms of the associated classical probability space (see [T%]h 

To be more precise, let $(\mathcal{N},\mathbb{P})$ be a quantum probability space, $\mathcal{A}\subset\mathcal{N}$ a
commutative subalgebra and $B\in\mathcal{N}$ a self-adjoint element that commutes with every $A\in\mathcal{A}$.
Then $B$ and $\mathcal{A}$ generate a larger commutative subalgebra $\mathcal{C}\subset\mathcal{N}$, to which we can
apply the spectral theorem to obtain a *-isomorphism $\iota$. The conditional expectation is
now simply inherited from the classical space as $\mathbb{P}(B|\mathcal{A}) = \iota^{-1}(\mathbf{E}_{\mathbf{P}}(\iota(B)\,|\,\sigma\{\iota(\mathcal{A})\}))$.
Note, however, that if $B$, $C$ are two self-adjoint operators that commute with
every $A\in\mathcal{A}$, this does not necessarily imply that $B$ and $C$ commute. The set
$\mathcal{A}' = \{B\in\mathcal{N} : AB = BA\ \ \forall A\in\mathcal{A}\}$, the commutant of $\mathcal{A}$ (in $\mathcal{N}$), is the largest
*-subalgebra of operators that can be conditioned on $\mathcal{A}$. The conditional expectation
is defined as above for its self-adjoint elements, and extends to all of $\mathcal{A}'$ by linearity.

From this discussion and the definition of the classical conditional expectation, 
we extract the following definition directly in terms of the quantum probability space. 

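Definition 2.7 (Conditional expectation, finite-dimensional case). Let $(\mathcal{N},\mathbb{P})$
be a finite-dimensional quantum probability space and let $\mathcal{A}\subset\mathcal{N}$ be a commutative
*-subalgebra. Then $\mathbb{P}(\cdot|\mathcal{A}) : \mathcal{A}'\to\mathcal{A}$ is called (a version of) the conditional expectation
from $\mathcal{A}'$ onto $\mathcal{A}$ if $\mathbb{P}(\mathbb{P}(B|\mathcal{A})A) = \mathbb{P}(BA)$ for all $A\in\mathcal{A}$, $B\in\mathcal{A}'$.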

As we will see, the discussion above generalizes directly to the infinite-dimensional
case. In finite dimensions it is convenient to give an explicit expression
for the conditional expectation. Note that a finite-dimensional *-algebra is a
finite-dimensional linear space. Then $\langle A,B\rangle_{\mathbb{P}} = \mathbb{P}(A^*B)$ turns the algebra into a pre-Hilbert
space, i.e. it is a Hilbert space except that $A\mapsto\langle A,A\rangle_{\mathbb{P}} = \|A\|_{\mathbb{P}}^2$ may have a nontrivial
null space. In particular, the fundamental property $\mathbb{P}(\mathbb{P}(B|\mathcal{A})A) = \mathbb{P}(BA)$ for all
$A\in\mathcal{A}$ says that $B - \mathbb{P}(B|\mathcal{A})$ is orthogonal to every element of $\mathcal{A}$; this is precisely
the property of orthogonal projection from $\mathcal{A}'$ onto the linear subspace $\mathcal{A}$,
which in a pre-Hilbert space is uniquely determined up to an event of zero probability.
Note that the classical characterization of $\mathbb{P}(B|\mathcal{A})$ as the least-mean-square estimate
of $B$ in $\mathcal{A}$ follows immediately. We will elaborate on this point later.

An explicit expression for $\mathbb{P}(B|\mathcal{A})$ is easily obtained if we find an orthogonal
basis for $\mathcal{A}$. Any commutative *-algebra in finite dimensions is spanned by a set of
projections that resolve the identity. This is easily seen: in n dimensions any
self-adjoint operator is a linear combination of at most n projections that resolve the
identity, and as all the operators in the *-algebra commute they must be expressible
as linear combinations of the same projections. Let $\mathcal{A} = \mathrm{span}\{P_a\}$ for some set of
projections $P_a$. Then a version of the conditional expectation is given by

$$\mathbb{P}(B|\mathcal{A}) = \sum_a \frac{P_a}{\|P_a\|_{\mathbb{P}}}\left\langle\frac{P_a}{\|P_a\|_{\mathbb{P}}},\,B\right\rangle_{\mathbb{P}} = \sum_a \frac{\mathbb{P}(P_aB)}{\mathbb{P}(P_a)}\,P_a. \qquad (2.10)$$

Note what could happen if we naively fill in some $B\notin\mathcal{A}'$. Then $\langle P,B\rangle_{\mathbb{P}}\neq\langle B,P\rangle_{\mathbb{P}}$
for some $P\in\{P_a\}$, which implies that we obtain complex coefficients in the sum even
if $B$ is an observable. Hence the expression does not make sense unless $B\in\mathcal{A}'$.

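Example 2.8. The following example serves to illustrate conditional expectations;
it is not meant to represent a particular physical scenario. Consider $\mathsf{H} = \mathbb{C}^3$,
$\mathcal{N} = M_3$ and $\mathbb{P}(X) = \langle\psi, X\psi\rangle$ with $\psi = (1\ 1\ 1)^T/\sqrt{3}$. Define $A, B\in\mathcal{N}$ by

$$A = \begin{pmatrix}4&0&0\\0&4&0\\0&0&5\end{pmatrix} = 4\begin{pmatrix}1&0&0\\0&1&0\\0&0&0\end{pmatrix} + 5\begin{pmatrix}0&0&0\\0&0&0\\0&0&1\end{pmatrix}, \qquad B = \begin{pmatrix}0&1&0\\1&0&0\\0&0&2\end{pmatrix}.$$

Let $\mathcal{A}$ be the *-algebra generated by $A$. Then

$$\mathcal{A} = \left\{\begin{pmatrix}x&0&0\\0&x&0\\0&0&y\end{pmatrix} : x,y\in\mathbb{C}\right\}, \qquad \mathcal{A}' = \left\{\begin{pmatrix}a&b&0\\c&d&0\\0&0&x\end{pmatrix} : a,b,c,d,x\in\mathbb{C}\right\}.$$

Note that $\mathcal{A}'$ is not a commutative algebra, even though every element of $\mathcal{A}'$
commutes with every element of $\mathcal{A}$. As $B\in\mathcal{A}'$, we can use (2.10) to calculate

$$\mathbb{P}(B|\mathcal{A}) = \begin{pmatrix}1&0&0\\0&1&0\\0&0&2\end{pmatrix} = 1\begin{pmatrix}1&0&0\\0&1&0\\0&0&0\end{pmatrix} + 2\begin{pmatrix}0&0&0\\0&0&0\\0&0&1\end{pmatrix}.$$

The observable $\mathbb{P}(B|\mathcal{A})$ is the orthogonal projection of $B$ onto $\mathcal{A}$ with respect to the
inner product $\langle A,B\rangle_{\mathbb{P}} = \mathbb{P}(A^*B)$. By the projection theorem, $\mathbb{P}(B|\mathcal{A})$ is an element
of $\mathcal{A}$ that minimizes the mean square error $\|B - \mathbb{P}(B|\mathcal{A})\|_{\mathbb{P}}$. □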
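
The formula (2.10) and the least-squares property are easy to check numerically. The sketch below uses the state $\psi = (1\ 1\ 1)^T/\sqrt{3}$ and the two projections that span the algebra generated by $A = \mathrm{diag}(4,4,5)$; the matrix `B` is an assumed self-adjoint element of the commutant, chosen for illustration, and the helper names `expect`, `cond_expect` and `mse` are hypothetical.

```python
import numpy as np

def expect(psi, X):
    """Vector state P(X) = <psi, X psi>."""
    return np.vdot(psi, X @ psi)

def cond_expect(B, projections, psi):
    """Eq. (2.10): sum_a P(P_a B) / P(P_a) * P_a, skipping events of zero probability."""
    out = np.zeros_like(B, dtype=complex)
    for P in projections:
        p = expect(psi, P).real
        if p > 1e-12:
            out += expect(psi, P @ B) / p * P
    return out

psi = np.ones(3) / np.sqrt(3)
P1 = np.diag([1.0, 1.0, 0.0])
P2 = np.diag([0.0, 0.0, 1.0])
# Assumed observable in the commutant of span{P1, P2} (block form).
B = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 2]], dtype=complex)

E = cond_expect(B, [P1, P2], psi)

# Defining property of Definition 2.7, tested on A = 4 P1 + 5 P2.
A = 4 * P1 + 5 * P2
defining_gap = abs(expect(psi, E @ A) - expect(psi, B @ A))

def mse(x, y):
    """Mean square error ||B - (x P1 + y P2)||_P^2 for a candidate estimate."""
    D = B - (x * P1 + y * P2)
    return expect(psi, D.conj().T @ D).real

print(np.round(E.real, 12))
print(defining_gap, mse(1.0, 2.0), mse(1.1, 2.0), mse(1.0, 1.9))
```

Perturbing the coefficients away from those returned by `cond_expect` can only increase `mse`, which is the projection-theorem statement in numerical form.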

We now proceed to develop a simple probe model that reproduces the projection
postulate. Recall that the conditional probability of an event $P$ given a commuting
event $Q$ is simply given by $\mathbb{P}(PQ)/\mathbb{P}(Q)$. This is equivalent to $\mathbf{P}(A\cap B)/\mathbf{P}(B)$ by
the spectral theorem, where $A$ and $B$ are the sets corresponding to $P$ and $Q$.

Example 2.9. (Simple probe model). We will work in a generic n-dimensional
setting, $n < \infty$. Let $\mathsf{H} = \mathbb{C}^n$, $\mathcal{N} = M_n$ (the set of $n\times n$ complex matrices), and
let $\mathbb{P}(X) = \mathrm{Tr}[\rho X]$ be some state on $\mathcal{N}$. Let $A$, $B$ be two observables in $\mathcal{N}$ that do
not commute. Hence we cannot measure $A$ and $B$ directly in the same realization.
However, we can have the system interact with an external probe system, in such a way
that the observable $A$ is copied to some probe observable $A'$ after the interaction. If
$A'$ commutes with $B$, we interpret this procedure (like in the Stern-Gerlach example)
as an (indirect) measurement of $A$ followed by a (direct) measurement of $B$.

The strategy is simple. First, we describe the probe system by a separate probe
quantum probability space $(\mathcal{N}_p,\mathbb{P}_p)$ and form the composite space $(\mathcal{N}\otimes\mathcal{N}_p,\ \mathbb{P}\otimes\mathbb{P}_p)$.
Next, we introduce an interaction. Recall from §2.1 that the evolution of an isolated
system is described by a unitary transformation. Hence, we will choose a probe
observable $I\otimes A'$ and construct a suitable unitary operator $U$ so that the probe
observable $U^*(I\otimes A')U$ after the interaction gives the same outcome as $A\otimes I$ would
have before the interaction. Note that by construction, the system observable $B\otimes I$
commutes with $I\otimes A'$ after the interaction, $[U^*(I\otimes A')U,\ U^*(B\otimes I)U] = 0$. Hence
we can measure them within the same realization.

We now fill out the details of this model. Let $A = \sum_{a\in\mathrm{spec}(A)} a\,P_a$, and
denote by $m$ the number of elements in $\mathrm{spec}(A)$ (the number of possible measurement
outcomes). For the probe algebra, we choose $\mathsf{H}_p = \mathbb{C}^m$, $\mathcal{N}_p = M_m$. Now fix an
observable $A'\in\mathcal{N}_p$ that has $m$ distinct measurement outcomes. Note that
$A' = \sum_{a\in\mathrm{spec}(A')} a\,P'_a$ and that the $P'_a$ are projections onto one-dimensional subspaces of $\mathsf{H}_p$;
hence we can fix an orthonormal basis of vectors $\psi_a\in\mathsf{H}_p$ such that $P'_a = \psi_a\psi_a^*$.
Now define the operator $X'_{ab} = \psi_b\psi_a^* + \psi_a\psi_b^* + \sum_{c\neq a,b}\psi_c\psi_c^*$ for $a\neq b$, and
$X'_{aa} = I$; these operators switch the events $P'_a$ and $P'_b$ in the sense $X'_{ab}P'_aX'_{ab} = P'_b$,
$X'_{ab}P'_bX'_{ab} = P'_a$, and $X'_{ab}P'_cX'_{ab} = P'_c$ for $c\neq a,b$. Finally, set $\mathbb{P}_p(X) = \mathrm{Tr}[P'_pX]$, where
we have fixed some $p\in\mathrm{spec}(A')$ at the outset.

Now consider the operator $U\in\mathcal{N}\otimes\mathcal{N}_p$ defined by $U = \sum_{a\in\mathrm{spec}(A)} P_a\otimes X'_{ap}$.
As $(X'_{ap})^2 = I$ it follows that $U^*U = UU^* = U^2 = I$, i.e. $U$ is unitary. Note that
$U^*(I\otimes P'_c)U = P_c\otimes P'_p + (I - P_c)\otimes P'_c$ if $c\neq p$, and $U^*(I\otimes P'_p)U = \sum_a P_a\otimes P'_a$. We
calculate $(\mathbb{P}\otimes\mathbb{P}_p)(U^*(I\otimes P'_c)U\,(P_c\otimes I))\,/\,(\mathbb{P}\otimes\mathbb{P}_p)(P_c\otimes I) = 1$ for every $c$, i.e., the
conditional probability that $U^*(I\otimes A')U$ gives the outcome $c$, given that we have
observed $A\otimes I$ with outcome $c$, is one. Thus the unitary interaction $U$ precisely
copies the system observable $A$ onto the probe observable $A'$.

We can now measure the system observable $B$ after interaction with the probe.
In particular, let us calculate the expectation of $B$ conditioned on the probe
measurement. Define $\mathcal{A}$ as the commutative *-algebra generated by $U^*(I\otimes A')U$, and note
that $U^*(B\otimes I)U\in\mathcal{A}'$. Thus we can use (2.10) to calculate

$$(\mathbb{P}\otimes\mathbb{P}_p)(U^*(B\otimes I)U\,|\,\mathcal{A}) = \sum_c \mathrm{Tr}[\rho_cB]\ U^*(I\otimes P'_c)U,$$

where $\rho_c = P_c\rho P_c/\mathrm{Tr}[\rho P_c]$. This is precisely the projection postulate of §2.1.

This example may be somewhat bewildering, and we encourage the reader to
work through the procedure for a particular model (e.g. that of Example 2.8), paying
particular attention to which operators do and do not commute. The reader should
convince himself that different answers are obtained if one first measures $B$, then $A$.

Finally, we note that though we have here measured $A$ through a probe and $B$
directly, there is no reason to stop here. If, in addition to $A$ and $B$, we want to measure
an observable $C$ that does not commute with $B$, we would introduce a second probe to
measure $B$ as well. Now suppose that $C = A$. If we first measure $A$ through the probe,
then measure $A$ again, we would (obviously) obtain the same outcome. However, if
we first probe $A$, then probe $B$, and then measure $A$, we may obtain a different outcome
than that of the first measurement of $A$. The reader is encouraged to work out also
this case. The reason for this phenomenon is that the interaction with the probe that
is used for the observation of $B$ disturbs the system in such a way that its value of $A$
is changed. This effect is known as "measurement back action". □
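
For readers who want to work through the probe construction on a concrete model, the following sketch builds $U$ for a three-dimensional system with two outcomes and checks the copy property and the conditional-expectation coefficients numerically. It is only an illustration under stated assumptions: the projections `P`, the random observable `B`, the random density matrix `rho`, the choice `p = 0` and the helper names `swap` and `E` are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
P = [np.diag([1.0, 1.0, 0.0]), np.diag([0.0, 0.0, 1.0])]   # spectral projections P_a of A
m = len(P)                                                   # number of outcomes
Pp = [np.diag(np.eye(m)[a]) for a in range(m)]               # probe events P'_a
p = 0                                                        # probe starts in event P'_p

G = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
B = (G + G.conj().T) / 2                  # assumed observable, does not commute with A
rho = G @ G.conj().T
rho /= np.trace(rho).real                 # assumed density matrix, P(X) = Tr[rho X]

def swap(a, b):
    """X'_{ab}: permutation matrix exchanging probe basis vectors a and b."""
    X = np.eye(m)
    X[[a, b]] = X[[b, a]]
    return X

U = sum(np.kron(P[a], swap(a, p)) for a in range(m))
Ud = U.conj().T
state = np.kron(rho, Pp[p])               # density matrix of the product state

def E(X):
    """(P (x) P_p)(X) = Tr[(rho (x) P'_p) X]."""
    return np.trace(state @ X)

results = []
for c in range(m):
    copied = Ud @ np.kron(np.eye(n), Pp[c]) @ U              # U*(I (x) P'_c)U
    sys_event = np.kron(P[c], np.eye(m))                     # P_c (x) I
    cond_prob = (E(copied @ sys_event) / E(sys_event)).real
    # Coefficient of U*(I (x) P'_c)U obtained from Eq. (2.10).
    coeff = (E(copied @ Ud @ np.kron(B, np.eye(m)) @ U) / E(copied)).real
    rho_c = P[c] @ rho @ P[c] / np.trace(rho @ P[c])
    results.append((cond_prob, coeff, np.trace(rho_c @ B).real))

print(np.allclose(U @ Ud, np.eye(n * m)))
for row in results:
    print(row)
```

Each row lists the conditional probability that the probe shows outcome $c$ given system outcome $c$, followed by the coefficient from (2.10) and the value $\mathrm{Tr}[\rho_cB]$ it should agree with.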

The previous example, in particular the construction of the probe and the
corresponding interaction, may seem rather ad hoc, and indeed we have only chosen
this rather artificial example to reproduce the projection postulate. This is not a
shortcoming of the theory we have outlined, however, but rather highlights the
importance of including a reasonable model of the probe in the quantum probability
space. Indeed, most realistic measurement setups are not of this type and the
projection postulate of §2.1 cannot be used to describe such systems. For example, we will
see that the Stern-Gerlach measurement is only approximately described by the
projection postulate. Later we will describe even more complicated optical
measurements in which we wish to condition system observables based on the observation of
stochastic processes in continuous time (the signal from a photodetector). It is the
latter, most practically useful case where we need quantum filtering theory.

Remark 2.10. It is important to realize that statements like the projection
postulate do not really implement the notion of conditioning; they consist of a pure
conditioning component and of a particular physical probe model which has no
statistical significance. One also finds in the literature generalizations of the projection
postulate, called instruments, which implement different types of probes [23, 39]. In
the quantum probability context of this paper it is most natural to separate the two
parts; we will take existing probe models from physics, and concentrate on the
calculation of the associated conditional expectations (filtering). □

3. Noncommutative probability theory. In the finite-dimensional case, we
have seen in §2 that quantum mechanics can be modeled as a noncommutative
probability theory. In this section we present a general formulation for quantum probability
that has wide applicability. We give a general definition of quantum probability space,
prove the existence and uniqueness of conditional expectations, and prove a quantum
version of Bayes' rule that is very helpful for quantum filtering.

Almost all of the features of the full theory can already be seen in the
finite-dimensional case discussed in §2; the main difficulties in the general case are the
technicalities involved in the theory of infinite-dimensional Hilbert spaces. This parallels
the difficulties in classical probability theory: though finite-state random variables
can be treated by almost trivial (counting, combinatoric) methods, the description
of continuous random variables requires us to upgrade our machinery using
methods of real analysis. Similarly, the elementary linear algebra that underlies
finite-dimensional quantum probability must be upgraded to functional analysis if we wish
to treat the infinite-dimensional case. Conceptually, however, the two cases are very
similar, and the reader is encouraged to develop an intuitive understanding of the
finite-dimensional case before tackling the full formalism. For a thorough
introduction to functional analysis we refer to the excellent textbook [55].

3.1. Quantum probability spaces. Let $\mathsf{H}$ be a complex Hilbert space, and
denote by $\mathcal{B}(\mathsf{H})$ the set of all bounded (linear) operators on $\mathsf{H}$. We restrict ourselves
(for the time being) to bounded operators as we wish to construct *-algebras of such
operators: attempting to do this with unbounded operators would get us into no end
of trouble, as we would surely run into domain problems. Recall that for $A\in\mathcal{B}(\mathsf{H})$,
the usual Hilbert space adjoint $A^*\in\mathcal{B}(\mathsf{H})$ is defined by $\langle\psi, A\phi\rangle = \langle A^*\psi, \phi\rangle$ for all $\psi,\phi\in\mathsf{H}$.
With this involution $\mathcal{B}(\mathsf{H})$ is a *-algebra in the sense of §2.

We wish to introduce a structure that plays the same role as a *-algebra in the
finite-dimensional case. It turns out, however, that the *-algebra structure in itself
is not sufficient in the infinite-dimensional case; we need to impose an additional
technical condition in order to be able to prove an infinite-dimensional version of the
spectral theorem, Theorem 2.4. The additional condition has a natural interpretation which
we will discuss below; however, the reader should not be too worried about this
technicality, particularly if he is not familiar with nets or locally convex topologies.
In practice we will rarely need to verify this property directly.

Definition 3.1. A positive linear functional $\mu : \mathcal{B}(\mathsf{H})\to\mathbb{C}$ is said to be normal
if $\mu(\sup_\alpha A_\alpha) = \sup_\alpha\mu(A_\alpha)$ for any upper bounded increasing net $\{A_\alpha\}$ of positive
elements in $\mathcal{B}(\mathsf{H})$. The locally convex topology on $\mathcal{B}(\mathsf{H})$ defined by the family of
seminorms $\{A\mapsto|\mu(A)| : \mu\ \text{normal}\}$ is called the normal topology.

For a detailed discussion of nets, locally convex topologies, etc., see [55].

Definition 3.2 (Von Neumann algebra). A von Neumann algebra $\mathcal{N}$ is a
*-subalgebra of $\mathcal{B}(\mathsf{H})$ that is closed in the normal topology. A state $\mathbb{P}$ on $\mathcal{N}$ is normal
if it is the restriction to $\mathcal{N}$ of a normal state on $\mathcal{B}(\mathsf{H})$.

We can now extend the spectral theorem to the infinite-dimensional case,
essentially showing that commutative von Neumann algebras with normal states are
equivalent to classical probability spaces. See e.g. [23, Proposition 1.18.1] for a proof.
Conceptually, we are guided by the finite-dimensional case; Theorem 3.3 extends the
idea of simultaneous diagonalization to infinite-dimensional operators. Though
technically much more involved, the flavor of the procedure remains the same.$^4$

Theorem 3.3 (Spectral theorem). Let $\mathcal{C}$ be a commutative von Neumann
algebra. Then there is a measure space $(\Omega,\mathcal{F},\mu)$ and a *-isomorphism $\iota$ from $\mathcal{C}$ to
$L^\infty(\Omega,\mathcal{F},\mu)$, the algebra of bounded measurable complex functions on $\Omega$ up to $\mu$-a.s.
equivalence. Moreover, a normal state $\mathbb{P}$ on $\mathcal{C}$ defines a probability measure $\mathbf{P}$, which
is absolutely continuous with respect to $\mu$, such that $\mathbb{P}(C) = \mathbf{E}_{\mathbf{P}}(\iota(C))$ for all $C\in\mathcal{C}$.

Before we continue, let us demonstrate the significance of the additional technical
conditions on a von Neumann algebra. First, we give an example of a *-subalgebra
of $\mathcal{B}(\mathsf{H})$ that is not a von Neumann algebra.

Example 3.4. Let $\mathsf{H} = L^2([0,1])$ and $\mathcal{A} = C([0,1])$, the commutative algebra
of continuous functions on the unit interval. We can consider $A\in\mathcal{A}$ as an operator on
$\mathsf{H}$ under pointwise multiplication, i.e. $(A\psi)(x) = A(x)\psi(x)$ for every $\psi\in\mathsf{H}$. Then $\mathcal{A}$
satisfies all the requirements of a von Neumann algebra except that it is not closed in
the normal topology. Indeed, one can construct, for example, an increasing sequence
of continuous functions that converges to $\chi_{[0,1/2]}$, which is discontinuous.

The problem is that the only indicator functions in $\mathcal{A}$ are $\chi_\varnothing$ and $\chi_{[0,1]}$: other
indicator functions on $[0,1]$ are discontinuous. Hence from a probabilistic point of view
$\mathcal{A}$ defines a trivial theory, as the only events in $\mathcal{A}$ are the trivial ones. Nonetheless $\mathcal{A}$
is much larger than the algebra $\mathbb{C}$ that is generated by $\chi_\varnothing$ and $\chi_{[0,1]}$. Hence $\mathcal{A}$ cannot
be *-isomorphic to the set of measurable functions on some measure space. The role
of normal closure is to avoid this complication. Indeed, this property guarantees that
any von Neumann algebra is generated by its projections [44]. □

Like normal closure, normality of the state is also required in order for the spectral
theorem to hold. Note that for normal states the expectation of an increasing set of
observables converges to the expectation of their least upper bound, i.e., the monotone
convergence property holds. This corresponds to the more basic property of countable
additivity. In the following example we construct a state which is not normal.

4 The additional measure $\mu$ that shows up in the theorem has no direct physical significance; its
job is to identify "enough" null sets in $L^\infty(\Omega)$ so we can construct the *-isomorphism $\iota$. We can
generally not use $\mathbf{P}$ for this purpose as there may be projections $P\in\mathcal{C}$ with $\mathbb{P}(P) = 0$; if $\iota$ were
to map to $L^\infty(\Omega,\mathcal{F},\mathbf{P})$ then necessarily $\iota(P) = 0$ and hence $\iota$ would not be invertible. The precise
details of the construction are never an issue, as we will never use $\mu$ and only prove results $\mathbf{P}$-a.s.

Example 3.5. Let $\mathsf{H} = \ell^2(\mathbb{N})$ and $\mathcal{A} = \ell^\infty(\mathbb{N})$, acting on $\mathsf{H}$ by pointwise
multiplication. $\mathcal{A}$ is closed in the normal topology, i.e. it is a commutative von
Neumann algebra. Now introduce a state on $\mathcal{A}$ which is given by the expression$^5$

$$\mathbb{P}(A) = \lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}A(n), \qquad A\in\mathcal{D}\subset\mathcal{A} \qquad (3.1)$$

on a suitably chosen linear subspace $\mathcal{D}$. $\mathbb{P}$ is not a normal state; to see this, let us
introduce the events $P_n\in\mathcal{A}$ defined by $(P_n\psi)(k) = \psi(k)$ if $k < n$, and zero otherwise.
$\{P_n\}$ is an increasing sequence of projections in $\mathcal{A}$ whose least upper bound is the
identity $P_\infty = I$. However, straightforward calculation shows that $\mathbb{P}(P_n) = 0$ for any
finite $n$, whereas $\mathbb{P}(I) = 1$. We conclude that the state $\mathbb{P}$ is not normal.

Note that what we have constructed is precisely the classical model of a uniform
distribution over the natural numbers $\mathbb{N}$. This does not give rise to a well-defined
probability model in the sense of Kolmogorov, however, as the uniform distribution
on $\mathbb{N}$ does not obey the property that the probability of a countable union of disjoint
events is the sum of the probabilities of these events (which is exactly what went
wrong above). Requiring that the state be normal is equivalent to requiring that it
gives rise to a countably additive measure [45], which rules out our example. □
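
The failure of monotone convergence in Example 3.5 can be seen already in the finite averages that define Eq. (3.1). The sketch below is illustrative only: `cesaro_mean`, `P_n` and the cutoff value 100 are names and numbers chosen for this example, and the finite `N` merely approximates the limit.

```python
import numpy as np

def cesaro_mean(A, N):
    """(1/N) * sum_{n=1}^{N} A(n): the finite-N approximation to Eq. (3.1)."""
    n = np.arange(1, N + 1)
    return float(np.mean(A(n)))

def P_n(cutoff):
    """Diagonal of the projection P_n, viewed as the indicator of {k < cutoff}."""
    return lambda k: (k < cutoff).astype(float)

def identity(k):
    """Diagonal of the identity operator."""
    return np.ones_like(k, dtype=float)

for N in (10**3, 10**4, 10**6):
    # The first average tends to 0 as N grows; the second is always 1.
    print(N, cesaro_mean(P_n(100), N), cesaro_mean(identity, N))
```

For any fixed cutoff the average of `P_n(cutoff)` decays like `cutoff / N`, while the average of the least upper bound `identity` stays at one, so the supremum of the expectations differs from the expectation of the supremum.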

Remark 3.6. Def. 3.2 is one of many equivalent definitions of a von Neumann
algebra. We have emphasized normality as it is close to the probabilistic notion of
monotone convergence. Normal closure turns out to be equivalent to closure in several
other topologies, notably the weak and strong operator topologies on $\mathcal{B}(\mathsf{H})$. We will
not concern ourselves with topological issues in this article; see e.g. sec. 2.4].

The following definition should come as no surprise. 

Definition 3.7 (Quantum probability space). A quantum probability space is a
pair $(\mathcal{N},\mathbb{P})$, where $\mathcal{N}$ is a von Neumann algebra and $\mathbb{P}$ is a normal state.

The structure has precisely the same interpretation as in §2, of which we briefly
remind the reader. In each realization we must choose a commutative von Neumann
subalgebra $\mathcal{A}\subset\mathcal{N}$ which fixes the observations. Every statistic that pertains to these
observations is then described by the classical probability model obtained by applying
the spectral theorem to $(\mathcal{A},\mathbb{P})$. The equivalence between commutative quantum
probability spaces and classical probability spaces is the foundation of the theory; a
commutative quantum probability model is a classical probabilistic model, and we
will often implicitly identify these two pictures.

In this article we will only use three types of von Neumann algebras. We list 
these below; they will be used throughout without comment. 

(i) $\mathcal{A} = \mathcal{B}(\mathsf{H})$ is a von Neumann algebra. Moreover, any vector state on $\mathcal{A}$
($\mathbb{P}(A) = \langle\psi, A\psi\rangle$ for fixed $\psi\in\mathsf{H}$), or any convex combination of vector states, is a
normal state. Many models from quantum mechanics are described by such a model.

(ii) $\mathcal{A} = L^\infty(\Omega,\mathcal{F},\mathbf{P})$, acting on $\mathsf{H} = L^2(\Omega,\mathcal{F},\mathbf{P})$ by pointwise multiplication, is a
commutative von Neumann algebra. Moreover, any state of the form $\mathbb{P}(X) = \mathbf{E}_{\mathbf{P}}(X)$
is a normal state. This is a classical probability model.

5 Eq. (3.1) does not by itself define a state, as there are many $A\in\mathcal{A}$ for which the limit does
not exist. However, note that Eq. (3.1) is well defined on a linear subspace, e.g.
$\mathcal{D} = \{A\in\mathcal{A} : \exists c\in\mathbb{C}\ \text{s.t.}\ \lim_{n\to\infty}A(n) = c\}$. Now $\mathbb{P}$ can be extended from $\mathcal{D}$ to $\mathcal{A}$ using the Hahn-Banach theorem.

(iii) Given $\mathcal{S}\subset\mathcal{B}(\mathsf{H})$, recall that $\mathcal{S}' = \{X\in\mathcal{B}(\mathsf{H}) : XS = SX\ \ \forall S\in\mathcal{S}\}$ is
called the commutant of $\mathcal{S}$ in $\mathcal{B}(\mathsf{H})$. The following theorem (see ^] Theorem 5.3.1]
for a proof) allows us to construct von Neumann subalgebras of $\mathcal{B}(\mathsf{H})$.

Theorem 3.8 (Double commutant theorem). Let $\mathcal{S}\subset\mathcal{B}(\mathsf{H})$ be any self-adjoint
set, i.e. $S\in\mathcal{S}\Rightarrow S^*\in\mathcal{S}$. Then $\mathcal{A} = \mathcal{S}''$ is the smallest von Neumann subalgebra
of $\mathcal{B}(\mathsf{H})$ that contains $\mathcal{S}$. In particular, $\mathcal{S}$ is a von Neumann algebra iff $\mathcal{S} = \mathcal{S}''$.

Given any $\mathcal{S}\subset\mathcal{B}(\mathsf{H})$, we call $\mathrm{vN}(\mathcal{S}) = (\mathcal{S}\cup\mathcal{S}^*)''$ the von Neumann algebra
generated by $\mathcal{S}$. We will repeatedly use this construction in the following. For
example, suppose that we decide to measure in one realization some commuting set of
observables $A_1,\ldots,A_n$. Then $\mathcal{A} = \mathrm{vN}(A_1,\ldots,A_n)$ is a commutative von Neumann
algebra which, through the spectral theorem, describes the associated classical
probability model. $\mathcal{A}$ is the quantum probability equivalent of the $\sigma$-algebra generated
by a set of random variables.
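
In finite dimensions the commutant, and hence $\mathrm{vN}(\mathcal{S})$, can be computed by linear algebra: the condition $XS = SX$ is linear in $X$, so the commutant is the null space of a matrix. The following is a rough sketch under that finite-dimensional assumption; the function name `commutant_basis`, the tolerance, and the test matrix $\mathrm{diag}(4,4,5)$ are choices made for illustration.

```python
import numpy as np

def commutant_basis(generators, tol=1e-10):
    """Return matrices spanning {X : XS = SX for every S in generators}."""
    n = generators[0].shape[0]
    eye = np.eye(n)
    # Row-major vectorisation: vec(XS) = (I kron S^T) vec(X), vec(SX) = (S kron I) vec(X).
    M = np.vstack([np.kron(eye, S.T) - np.kron(S, eye) for S in generators])
    _, sing, vh = np.linalg.svd(M)
    rank = int(np.sum(sing > tol))
    # Rows of vh beyond the rank span the null space (after complex conjugation).
    return [v.conj().reshape(n, n) for v in vh[rank:]]

A = np.diag([4.0, 4.0, 5.0])               # a single self-adjoint generator
first = commutant_basis([A])                # {A}'
second = commutant_basis(first)             # {A}'' = vN(A)

print(len(first), len(second))
for Y in second:
    print(np.round(Y, 6))
```

Here `first` has five elements (the block matrices commuting with `A`) and `second` has two, spanning the diagonal matrices of the form $\mathrm{diag}(x,x,y)$, i.e. the span of the spectral projections of `A`.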

3.2. Random variables. Now that we have a general definition of a quantum
probability space, we can develop some tools to deal with random variables. Recall
from §2 that any self-adjoint element of a quantum probability space can be
decomposed into events using Eq. (2.4), which gives its interpretation as an observable
(random variable). Let us show how to do this in the infinite-dimensional case.

Let $(\mathcal{N},\mathbb{P})$ be a quantum probability space and consider an element $A\in\mathcal{N}$ which
is self-adjoint, $A = A^*$. Then $\mathcal{A} = \mathrm{vN}(A)\subset\mathcal{N}$ is a commutative von Neumann
algebra. By the spectral theorem, there is a probability space $(\Omega,\mathcal{F},\mathbf{P})$ and a
*-isomorphism $\iota$ that maps $A$ to some (measurable) random variable $a : \Omega\to\mathbb{R}$. We
can now do classical probability theory; in particular, for any Borel set $B\subset\mathbb{R}$ we
have the event $[a\in B] = \{\omega\in\Omega : a(\omega)\in B\} = a^{-1}(B)\in\mathcal{F}$. To map this event
back to $\mathcal{A}$ we simply invert $\iota$; the projection corresponding to $[a\in B]$ is denoted by
$P_A(B) = \iota^{-1}(\chi_{[a\in B]})$, and we call the map $P_A$ from the Borel sets to the projections in $\mathcal{N}$ the
spectral measure of $A$. But this object is a familiar one from functional analysis [55];
in fact, it is well known that we can express $A$ in terms of its spectral measure by

$$A = \int_{\mathbb{R}}\lambda\,P_A(d\lambda), \qquad (3.2)$$

where the integral is defined in a suitable sense [55]. Eq. (3.2) is precisely the
infinite-dimensional counterpart of Eq. (2.4). We emphasize the physical interpretation of
$P_A(B)$: it is the event [$A$ takes a value in $B$], which occurs with probability $\mathbb{P}(P_A(B))$.
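
In finite dimensions the spectral measure reduces to a sum of eigenprojections, and the integral in Eq. (3.2) becomes a finite sum. The sketch below illustrates this special case only; the helper names `spectral_measure` and `P_A`, the rounding used to group numerically equal eigenvalues, and the example matrix and state are assumptions made for the illustration.

```python
import numpy as np

def spectral_measure(A, decimals=9):
    """Map each distinct eigenvalue a of a Hermitian matrix A to its projection P_a."""
    vals, vecs = np.linalg.eigh(A)
    proj = {}
    # Eigenvalues equal after rounding are treated as one outcome (an assumption).
    for lam, v in zip(np.round(vals, decimals), vecs.T):
        lam = float(lam)
        proj[lam] = proj.get(lam, 0) + np.outer(v, v.conj())
    return proj

def P_A(proj, borel_set):
    """Projection for the event [A takes a value in B]; B is given as a predicate."""
    n = next(iter(proj.values())).shape[0]
    return sum((P for a, P in proj.items() if borel_set(a)), np.zeros((n, n)))

A = np.diag([4.0, 4.0, 5.0])
psi = np.ones(3) / np.sqrt(3)

proj = spectral_measure(A)
event = P_A(proj, lambda a: a < 4.5)            # the event [A < 4.5]
prob = np.vdot(psi, event @ psi).real           # its probability in the state psi
reconstructed = sum(a * P for a, P in proj.items())   # discrete form of Eq. (3.2)

print(prob, np.allclose(reconstructed, A))
```

Representing the Borel set by a predicate keeps the sketch short; any other encoding of subsets of the real line would do equally well.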

This would be all there is to it, were it not for the fact that our algebras contain 
only bounded operators (recall that unbounded operators cannot be defined on the 
entire Hilbert space, and hence cannot be added or multiplied at will). Evidently we
didn't lose much by this choice, as the probabilistic model is already contained in an 
algebra of bounded operators by the spectral theorem. An unfortunate side effect, 
however, is that self-adjoint operators in the algebra can only represent bounded ran- 
dom variables, whereas many observations of interest are quite naturally unbounded 
(think of a Gaussian random variable). This means that we need to deal with un- 
bounded observables separately. We briefly discuss one way of doing this. 

Consider a von Neumann algebra $\mathcal{N}\subset\mathcal{B}(\mathsf{H})$. In general, an observable is defined
by a (not necessarily bounded) self-adjoint operator $A$ on some dense domain in $\mathsf{H}$. We
need to relate the unbounded operator $A$ to $\mathcal{N}$. The trick we use is remarkably simple:
we compute a bounded function of $A$. Define $T_A = (A + iI)^{-1}$. By elementary spectral
theory [55], any self-adjoint $A$ has a real spectrum, and hence $A + iI$ is invertible with
bounded inverse. We say that $A$ is affiliated to $\mathcal{N}$ if $T_A\in\mathcal{N}$. This is the equivalent
of the classical notion of a random variable that is measurable with respect to some
$\sigma$-algebra $\mathcal{G}$. Note that every self-adjoint $A$ is affiliated to $\mathcal{B}(\mathsf{H})$, and if $A$ is also
bounded then $A$ is affiliated to $\mathcal{N}$ iff $A\in\mathcal{N}$.

We wish to represent $A$ as a classical (unbounded) random variable. To this end,
define the von Neumann algebra generated by $A$ as $\mathrm{vN}(A) = \mathrm{vN}(T_A)$. Now note that
$T_A$ commutes with its adjoint, hence $\mathrm{vN}(A)$ is a commutative von Neumann algebra
to which we can apply the spectral theorem. All we need to do is to "package"
$A$ into $T_A$, apply $\iota$, and "unpack" it on the other end; in other words, we define
$\iota(A) = \iota(T_A)^{-1} - i$. Once we have done this, we can define a spectral measure $P_A$
for $A$ in the usual way, and indeed Eq. (3.2) still holds even for unbounded $A$ [55].
We remark that $A$ being affiliated to $\mathcal{N}$ corresponds to the fact that $P_A(B)\in\mathcal{N}$ for
every Borel set $B$; this is precisely the classical notion of measurability.
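
The "package and unpack" step can be illustrated with a matrix, keeping in mind that in finite dimensions every operator is bounded, so the sketch only demonstrates the algebra of the map $A\mapsto T_A$ and not the domain issues it is designed to avoid. The matrix `A` below is an arbitrary assumed example.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, -3.0]])      # toy self-adjoint matrix (assumed)
I = np.eye(2)
T = np.linalg.inv(A + 1j * I)                 # T_A = (A + iI)^{-1}

# T_A commutes with its adjoint, so it generates a commutative *-algebra.
is_normal = np.allclose(T @ T.conj().T, T.conj().T @ T)
norm_T = np.linalg.norm(T, 2)                 # operator norm of T_A
A_back = np.linalg.inv(T) - 1j * I            # "unpack": A = T_A^{-1} - iI

# On the spectrum, T_A acts as a -> 1/(a + i), and iota(A) = iota(T_A)^{-1} - i.
eigs = np.linalg.eigvalsh(A)
packed = 1.0 / (eigs + 1j)
unpacked = (1.0 / packed - 1j).real

print(is_normal, norm_T, np.allclose(A_back, A), np.allclose(unpacked, eigs))
```

Since $|1/(a+i)|\le 1$ for every real $a$, the operator norm of $T_A$ never exceeds one, however large the eigenvalues of $A$ are; this is what makes the construction usable for unbounded $A$.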

Unbounded operators are a nuisance, but unfortunately they are a fact of life 
in mathematical physics. In this article, particularly in the later sections, we will 
occasionally add and multiply unbounded operators without justification; a detailed 
analysis of the operator domains is beyond our scope. Though this does not often 
cause trouble, the reader should keep in mind that a fully rigorous treatment must 
verify that any addition or multiplication of unbounded operators is indeed well de- 
fined. We quote one useful result: operators affiliated to a commutative von Neumann 
algebra can be added and multiplied at will ^] Theorem 5.6.15], [S3] . 

Example 3.9. We take $\mathsf{H} = L^2(\mathbb{R})$ and $\mathcal{N} = \mathcal{B}(\mathsf{H})$. The vector

$$\psi\in\mathsf{H}, \qquad \psi(x) = (2\pi\sigma^2)^{-1/4}\exp\left(-\frac{(x-\mu)^2}{4\sigma^2}\right)$$

defines the (pure) state $\mathbb{P}(X) = \langle\psi, X\psi\rangle$. Now consider the self-adjoint operators

$$(Q\psi)(x) = x\,\psi(x), \qquad (P\psi)(x) = -i\hbar\,\frac{d\psi}{dx}(x),$$

which are prototypical observables for the position $Q$ and momentum $P$ of a quantum
particle. Both are unbounded observables, but their domains include at least the set
of smooth functions with compact support, which is dense in $L^2(\mathbb{R})$.

What random variables do these represent? We can read off from the definition
that $Q$ is a Gaussian random variable with mean $\mu$ and variance $\sigma^2$: as $Q$ is already
in "diagonal" form ($Q$ is affiliated to $L^\infty(\mathbb{R})\subset\mathcal{N}$), its spectral measure is given by

$$(P_Q(B)\psi)(x) = \chi_B(x)\,\psi(x)$$

and it is evident that $\mathbb{P}(P_Q(B))$ is a Gaussian measure with mean $\mu$ and variance
$\sigma^2$. Alternatively, consider the characteristic function $q(k) = \mathbb{P}(e^{ikQ})$ of $Q$. Unlike $Q$,
$e^{ikQ}$ is a bounded operator and we can directly compute

$$q(k) = \langle\psi, e^{ikQ}\psi\rangle = (2\pi)^{-1/2}\sigma^{-1}\int_{-\infty}^{\infty}e^{ikx}\,e^{-(x-\mu)^2/2\sigma^2}\,dx = e^{ik\mu - k^2\sigma^2/2},$$

which is the characteristic function of a Gaussian random variable with mean $\mu$ and
variance $\sigma^2$. Similarly, $e^{ikP}$ is a bounded operator, and we compute

$$p(k) = \mathbb{P}(e^{ikP}) = \langle\psi, e^{ikP}\psi\rangle = \int_{-\infty}^{\infty}\psi(x)\,\psi(x+\hbar k)\,dx = e^{-\hbar^2k^2/8\sigma^2},$$

which is the characteristic function of a Gaussian random variable with mean zero
and variance $\hbar^2/4\sigma^2$. Thus both $Q$ and $P$ are Gaussian random variables, but their
joint distribution is undefined as they do not commute. Note that we cannot choose
$\sigma$ so that both $Q$ and $P$ have arbitrarily small variance: this is a manifestation of the
Heisenberg uncertainty relation (compare Eq. (2.3)). □
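
Both characteristic functions in Example 3.9 can be checked by direct numerical integration. The sketch below does this with a plain Riemann sum on a wide grid; the numerical values of $\hbar$, $\mu$, $\sigma$ and $k$, the grid, and the function names are assumptions made for the check, not part of the example.

```python
import numpy as np

hbar, mu, sigma = 1.0, 0.5, 0.8             # illustrative values (assumed)
x = np.linspace(mu - 12 * sigma, mu + 12 * sigma, 20001)
dx = x[1] - x[0]

def psi(y):
    """Gaussian wave function with |psi|^2 of mean mu and variance sigma^2."""
    return (2 * np.pi * sigma**2) ** -0.25 * np.exp(-(y - mu) ** 2 / (4 * sigma**2))

def q_char(k):
    """<psi, e^{ikQ} psi> as a Riemann sum of e^{ikx} |psi(x)|^2."""
    return np.sum(np.exp(1j * k * x) * psi(x) ** 2) * dx

def p_char(k):
    """<psi, e^{ikP} psi> as a Riemann sum of psi(x) psi(x + hbar k)."""
    return np.sum(psi(x) * psi(x + hbar * k)) * dx

k = 0.7
q_exact = np.exp(1j * k * mu - k**2 * sigma**2 / 2)
p_exact = np.exp(-hbar**2 * k**2 / (8 * sigma**2))

# Standard deviations read off from the two Gaussian laws.
sd_Q, sd_P = sigma, hbar / (2 * sigma)

print(abs(q_char(k) - q_exact), abs(p_char(k) - p_exact), sd_Q * sd_P)
```

The product `sd_Q * sd_P` equals $\hbar/2$ for every choice of `sigma`, which is the numerical face of the remark that the two variances cannot both be made arbitrarily small.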

The following example plays a central role in the physics of harmonic oscilla- 
tors; we will encounter a very similar construction later for continuous-time quantum 
stochastic processes. We will need the following classic result (see e.g. 03] for a proof). 

Theorem 3.10 (Stone's theorem). Let $\mathcal{N}$ be a von Neumann algebra and let
$\{U_t\}_{t\in\mathbb{R}}\subset\mathcal{N}$ be a group of unitary operators that is strongly continuous. Then there
is a unique self-adjoint $A$ affiliated to $\mathcal{N}$, the Stone generator, such that $U_t = e^{itA}$.

Example 3.11. Let H = £ 2 (N) and Jf = 38(H). Define the complete or- 
thonormal basis {i/) n , n = 0, 1, . ..} C H, where ip n (k) = 1 if k = n and il> n (k) = 
otherwise. Moreover, we define for every a € C the exponential vector e(a) € H by 
e(a)(k) = a k j\fk\, and we remark that the linear span D of all exponential vectors 
is dense in H. The normalized exponential vectors e(a)e~' Q I" 1 are called coherent 
vectors, and can be used to define the coherent states ¥ a (X) = (e(a), Xe(a)) e~' a . 

The simplest random variable we can investigate is defined by (\ij})(k) = ktp(k) — 
i.e. this is the natural diagonal operator affiliated to ^°°(N) C N. The spectral 
measure of A is given by (P\(B) r >j))(k) = XB(k)i/}(k), from which we obtain directly 

F a (P x (B)) = (e(a),P x (B)e(a))e-^ = £ ALLX. 

keB 

Thus evidently, A is a Poisson-distributed random variable with intensity \a\ 2 . 

Can we find other interesting observables affiliated to $\mathscr{N}$? In many cases, physically 
relevant observables are found to be the Stone generators of particular unitary 
symmetry groups; see e.g. [23] for a lucid discussion. Let us try to implement this procedure 
with the two-dimensional translation group. As a first attempt, let us define 
a translation operator by $D_\gamma e(\alpha) = e(\alpha+\gamma)\, e^{|\alpha|^2/2 - |\alpha+\gamma|^2/2}$ for $\gamma \in \mathbb{C}$; the constant 
factor ensures that $\|D_\gamma e(\alpha)\| = \|e(\alpha)\|$, as must be the case for any unitary operator. 
Unfortunately, $D_\gamma$ is not in fact unitary; a straightforward calculation shows 

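$\langle e(\beta), D_\gamma^* D_\gamma e(\alpha)\rangle = \langle D_\gamma e(\beta), D_\gamma e(\alpha)\rangle = e^{\beta^*\alpha}\, e^{i(\mathrm{Im}(\beta^*\gamma) - \mathrm{Im}(\alpha^*\gamma))}$ 

which contradicts unitarity $D_\gamma^* D_\gamma = I$, i.e. $\langle e(\beta), D_\gamma^* D_\gamma e(\alpha)\rangle = \langle e(\beta), e(\alpha)\rangle = e^{\beta^*\alpha}$. 
To fix this, define the Weyl operator 

$W_\gamma e(\alpha) = e(\alpha+\gamma)\, e^{|\alpha|^2/2 - |\alpha+\gamma|^2/2}\, e^{i\,\mathrm{Im}(\alpha^*\gamma)} = e(\alpha+\gamma)\, e^{-\gamma^*\alpha - |\gamma|^2/2}.$ 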

The Weyl operator is unitary, and provides a projective unitary representation [37] in 
the sense that $W_\alpha W_\beta = W_{\alpha+\beta}\, e^{i\,\mathrm{Im}(\beta^*\alpha)}$. Note that it is sufficient to define the action 
of $W_\alpha$ only on exponential vectors; we can then extend to $D$ by linearity, and as $D$ is 
dense and $W_\alpha$ is bounded the Weyl operators are uniquely extended to all of $\mathsf{H}$. 
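
Because a Weyl operator maps an exponential vector to a scalar multiple of another exponential vector, the composition law can be checked with plain complex arithmetic. The sketch below represents the vector $c\,e(\alpha)$ by the pair `(c, alpha)`; this encoding and the function names are illustrative assumptions, and the phase convention is the one written above, $W_\gamma e(\alpha) = e(\alpha+\gamma)\,e^{-\gamma^*\alpha - |\gamma|^2/2}$.

```python
import cmath

def weyl(gamma: complex, state):
    """Apply W_gamma to c * e(alpha), encoded as the pair (c, alpha)."""
    c, alpha = state
    factor = cmath.exp(-gamma.conjugate() * alpha - abs(gamma) ** 2 / 2)
    return (c * factor, alpha + gamma)

def squared_norm(state) -> float:
    """||c e(alpha)||^2 = |c|^2 exp(|alpha|^2) in l^2."""
    c, alpha = state
    return abs(c) ** 2 * cmath.exp(abs(alpha) ** 2).real

a, b, g = 0.3 - 1.1j, -0.7 + 0.4j, 0.5 + 0.2j   # illustrative values
lhs = weyl(a, weyl(b, (1.0 + 0j, g)))            # W_a W_b e(g)
c_sum, alpha_sum = weyl(a + b, (1.0 + 0j, g))    # W_{a+b} e(g)
phase = cmath.exp(1j * (b.conjugate() * a).imag)
rhs = (c_sum * phase, alpha_sum)
```

Here `lhs` and `rhs` coincide, and `squared_norm` is unchanged by `weyl`, which is the isometry property on exponential vectors.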

Now fix $\beta \in \mathbb{C}$ and consider the unitary group $\{W_{t\beta}\}_{t\in\mathbb{R}}$. This group is continuous 
($W_{t\beta}\, e(\gamma) \to e(\gamma)$ as $t \to 0$) and hence by Stone's theorem, there exists a 
self-adjoint operator $B_\beta$ such that $W_{t\beta} = e^{itB_\beta}$. Finding the distribution of the observable 
$B_\beta$ is straightforward, as the characteristic function of $B_\beta$ is given by 

$b_\beta(k) = \mathbb{P}_\alpha(W_{k\beta}) = \langle e(\alpha), e(\alpha + k\beta)\rangle\, e^{-k\beta^*\alpha - k^2|\beta|^2/2 - |\alpha|^2} = e^{2ik\,\mathrm{Im}(\alpha^*\beta) - k^2|\beta|^2/2}.$ 
Hence $B_\beta$ is a Gaussian random variable with mean $2\,\mathrm{Im}(\alpha^*\beta)$ and variance $|\beta|^2$. 
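
The closed form of the characteristic function can be compared against a direct evaluation of the inner product in a truncated basis. This is a sketch under stated assumptions: the truncation level `K`, the helper names and the sample values are illustrative only.

```python
import cmath
import math

def exp_vec(alpha: complex, K: int):
    """First K components of the exponential vector e(alpha)."""
    return [alpha ** k / math.sqrt(math.factorial(k)) for k in range(K)]

def inner(u, v) -> complex:
    """Inner product, conjugate-linear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def char_fn(alpha: complex, beta: complex, k: float, K: int = 80) -> complex:
    """P_alpha(W_{k beta}), computed from the definition of the Weyl operator."""
    gamma = k * beta
    weyl_factor = cmath.exp(-gamma.conjugate() * alpha - abs(gamma) ** 2 / 2)
    overlap = inner(exp_vec(alpha, K), exp_vec(alpha + gamma, K))
    return overlap * weyl_factor * math.exp(-abs(alpha) ** 2)

def gaussian_char_fn(alpha: complex, beta: complex, k: float) -> complex:
    """Characteristic function of a Gaussian with mean 2 Im(alpha* beta), variance |beta|^2."""
    mean = 2 * (alpha.conjugate() * beta).imag
    return cmath.exp(1j * k * mean - k * k * abs(beta) ** 2 / 2)
```

For moderate values of `alpha`, `beta` and `k` the two functions agree to within the truncation error.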
Our next task is to obtain an explicit representation of $B_\beta$. We proceed as follows: 

$B_\beta\, e(\alpha) = \frac{1}{i}\frac{d}{dt}\, W_{t\beta}\, e(\alpha)\Big|_{t=0} = i\beta^*\alpha\, e(\alpha) - i\,\frac{d}{dt}\, e(\alpha + t\beta)\Big|_{t=0}.$ 

One can verify explicitly that this expression makes sense, i.e. $B_\beta\, e(\alpha) \in \mathsf{H}$. Note 
that we cannot extend $B_\beta$ to all of $\mathsf{H}$, as $B_\beta$ is unbounded. However, we see that the 
domain of $B_\beta$ contains at least the exponential domain $D$. 

Let us introduce the following notation. Define $q = B_i$, $p = B_{-1}$, and $a = 
(q + ip)/2$. Note that $q$ and $p$ are self-adjoint by Stone's theorem, whereas $a$ has the 
adjoint $a^* = (q - ip)/2$. Moreover, we find that $a\, e(\alpha) = \alpha\, e(\alpha)$. But then 

$(a\, e(\alpha))(k) = \alpha\,\frac{\alpha^k}{\sqrt{k!}} = \sqrt{k+1}\,\frac{\alpha^{k+1}}{\sqrt{(k+1)!}} = \sqrt{k+1}\; e(\alpha)(k+1).$ 

This implies that we can extend the domain of $a$ to include also the $\{\psi_n\}$ by defining 
$a\,\psi_{k+1} = \sqrt{k+1}\,\psi_k$ (where $a\,\psi_0 = 0$). Furthermore, from 

$\langle \psi_m, a^*\psi_k\rangle = \langle a\,\psi_m, \psi_k\rangle = \sqrt{m}\,\delta_{(m-1)k} = \sqrt{k+1}\,\delta_{m(k+1)}$ 

we can read off $a^*\psi_k = \sqrt{k+1}\,\psi_{k+1}$. $a^*$ is known as the creation (or raising) operator 
and $a$ as the annihilation (or lowering) operator. 
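
The action of the two ladder operators on components can be written out directly. The following sketch works on a vector truncated to finitely many components (the truncation and the function names are assumptions made for illustration); it encodes $a\,\psi_{k+1} = \sqrt{k+1}\,\psi_k$ and $a^*\psi_k = \sqrt{k+1}\,\psi_{k+1}$ in terms of components.

```python
import math

def annihilate(psi):
    """(a psi)(k) = sqrt(k+1) psi(k+1); the component beyond the truncation is taken as 0."""
    n = len(psi)
    return [math.sqrt(k + 1) * psi[k + 1] if k + 1 < n else 0.0 for k in range(n)]

def create(psi):
    """(a* psi)(k) = sqrt(k) psi(k-1), with (a* psi)(0) = 0."""
    return [math.sqrt(k) * psi[k - 1] if k > 0 else 0.0 for k in range(len(psi))]

def exp_vec(alpha: complex, K: int):
    """First K components of the exponential vector e(alpha)."""
    return [alpha ** k / math.sqrt(math.factorial(k)) for k in range(K)]

alpha = 0.4 + 0.9j            # illustrative value
e_alpha = exp_vec(alpha, 40)
lowered = annihilate(e_alpha)                # should equal alpha * e(alpha), except at the cut-off
number = create(annihilate(e_alpha))         # a* a acts as multiplication by k
```

Away from the last truncated component, `lowered[k]` equals `alpha * e_alpha[k]`, and `number[k]` equals `k * e_alpha[k]`, the diagonal operator of the Poisson example.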

Finally, note that $\Lambda = a^*a$. From a classical probability point of view this is very 
remarkable indeed. Not only do both Poisson and Gaussian random variables emerge 
from the same state $\mathbb{P}_\alpha$, but there is even a continuous map $q, p \mapsto (q - ip)(q + ip)/4 = \Lambda$ 
that transforms two Gaussian random variables into a Poisson random variable. One 
could never continuously transform a continuous classical random variable into a 
discrete classical random variable; however, we get away with it here because $p$, $q$ 
and $\Lambda$ do not commute with one another. Thus in each realization we can choose to 
measure either a discrete or a continuous random variable, but not both. □ 

Remark 3.12. Though presented rather differently, the last two examples are 
in fact $*$-isomorphic in the case that $\sigma^2 = \frac{1}{2}$ in the first example. For example, if 
$\alpha \in \mathbb{R}$ we can map $p \mapsto 2^{1/2}\hbar^{-1}P$, $q \mapsto 2^{1/2}Q$, and $\mathbb{P}_\alpha \mapsto \mathbb{P}_{\mu = 2^{1/2}\alpha,\ \sigma = 2^{-1/2}}$. From the 
expression for $b_\beta(k)$ we see that in a coherent state both $p$ and $q$ must have the same 
variance. In the first example we allowed for the variance of $Q$ to shrink, though this 
necessarily increases the variance of $P$. This results in a "squeezed state" which can 
also be introduced in the context of the second example. We will not construct such 
states here; in the following, we will only use coherent states. □ 

3.3. Conditional expectation. We now consider conditional expectations, fol- 
lowing the treatment of JH| . The following definition is identical to the one in 

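Definition 3.13 (Conditional expectation). Let $(\mathscr{N}, \mathbb{P})$ be a quantum probability 
space and let $\mathscr{A} \subset \mathscr{N}$ be a commutative von Neumann subalgebra. Then the map 
$\mathbb{P}(\cdot\,|\,\mathscr{A}) : \mathscr{A}' \to \mathscr{A}$ is called (a version of) the conditional expectation from $\mathscr{A}'$ onto 
$\mathscr{A}$ if $\mathbb{P}(\mathbb{P}(B|\mathscr{A})A) = \mathbb{P}(BA)$ for all $A \in \mathscr{A}$, $B \in \mathscr{A}'$. 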

We briefly recall the significance of $\mathscr{A}'$. $\mathscr{A}$ is the algebra generated by our observations: 
it must be commutative, as we cannot observe incompatible events in a 
single experiment. We now wish to find the conditional statistics of an observable 
$B$ that is not affiliated to $\mathscr{A}$. However, as we have already observed $\mathscr{A}$, this is only 
sensible if $B$ commutes with every element in $\mathscr{A}$: there would be no physical way to 
test our predictions if we could not subsequently measure $B$ in the same realization. 

Remark 3.14. Recall that if $B = B^*$ we can use the spectral theorem to obtain 
explicitly $\mathbb{P}(B|\mathscr{A}) = \iota^{-1}(\mathbf{E}_{\mathbf{P}}(\iota(B)\,|\,\sigma\{\iota(\mathscr{A})\}))$. This representation extends even to 
the case that $B$ is an unbounded self-adjoint operator that is affiliated to $\mathscr{A}'$. For 
simplicity we will discuss below the properties of $\mathbb{P}(B|\mathscr{A})$ assuming that $B$ is bounded, 
but with suitable care the treatment extends also to the unbounded case. □ 

Remark 3.15. A more general definition (see e.g. [STj), of which Definition 
3.13 is a special case, is often used in quantum probability. Unlike our definition, 
which is motivated by statistical inference and filtering, the more general "conditional 
expectation" allows for conditioning on noncommutative algebras and does not have a 
direct statistical interpretation. The more general definition is used e.g. in the theory 
of noncommutative Markov processes. We will not dwell on this further. □ 

Theorem 3.16. The conditional expectation of Definition 3.13 exists and is 
unique with probability one (any two versions $P$ and $Q$ of $\mathbb{P}(B|\mathscr{A})$ satisfy $\|P - Q\|_{\mathbb{P}} = 
0$, where $\|X\|_{\mathbb{P}}^2 = \mathbb{P}(X^*X)$). Moreover, $\mathbb{P}(B|\mathscr{A})$ is the least mean square estimate of 
$B$ given $\mathscr{A}$ in the sense that $\|B - \mathbb{P}(B|\mathscr{A})\|_{\mathbb{P}} \le \|B - A\|_{\mathbb{P}}$ for all $A \in \mathscr{A}$. 

Proof. 

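(i) Existence. We have already established that for self-adjoint $B \in \mathscr{A}'$, we 
can explicitly define a $\mathbb{P}(B|\mathscr{A})$ that satisfies the conditions of Definition 3.13 using 
the spectral theorem. The classical conditional expectation exists, and moreover the 
conditional expectation of a bounded random variable is bounded. Hence $\mathbb{P}(B|\mathscr{A})$ 
exists in $\mathscr{A}$ for self-adjoint $B \in \mathscr{A}'$. But any $B \in \mathscr{A}'$ can be written as $B = B_1 + iB_2$ 
with self-adjoint $B_1 = (B + B^*)/2$ and $B_2 = i(B^* - B)/2$. As $\mathbb{P}(B_1|\mathscr{A})$ and $\mathbb{P}(B_2|\mathscr{A})$ 
exist and $\mathbb{P}(B|\mathscr{A}) = \mathbb{P}(B_1|\mathscr{A}) + i\,\mathbb{P}(B_2|\mathscr{A})$ satisfies the conditions of Definition 3.13, 
existence is proved. 

(ii) Uniqueness w.p. one. Define the pre-inner product $\langle X, Y\rangle = \mathbb{P}(X^*Y)$ on $\mathscr{A}'$ 
(it might have nontrivial kernel). Then $\langle A, B - \mathbb{P}(B|\mathscr{A})\rangle = \mathbb{P}(A^*B) - \mathbb{P}(A^*\,\mathbb{P}(B|\mathscr{A})) = 0$ 
for all $A \in \mathscr{A}$ and $B \in \mathscr{A}'$, i.e. $B - \mathbb{P}(B|\mathscr{A})$ is orthogonal to $\mathscr{A}$. Now let $P$ and 
$Q$ be two versions of $\mathbb{P}(B|\mathscr{A})$. It follows that $\langle A, P - Q\rangle = 0$ for all $A \in \mathscr{A}$. But 
$P - Q \in \mathscr{A}$, so $\langle P - Q, P - Q\rangle = \|P - Q\|_{\mathbb{P}}^2 = 0$. 

(iii) Least squares. Let $P$ be a version of $\mathbb{P}(B|\mathscr{A})$. Then for all $K \in \mathscr{A}$ 

$\|B - K\|_{\mathbb{P}}^2 = \|B - P + P - K\|_{\mathbb{P}}^2 = \|B - P\|_{\mathbb{P}}^2 + \|P - K\|_{\mathbb{P}}^2 \ge \|B - P\|_{\mathbb{P}}^2$ 

where, in the second step, we used that $(B - \mathbb{P}(B|\mathscr{A})) \perp (\mathbb{P}(B|\mathscr{A}) - K) \in \mathscr{A}$. □ 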
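
The least-squares property is easiest to see in the fully commutative, finite special case, where every observable is a function on a finite sample space and conditioning on $\mathscr{A}$ means averaging over the blocks of a partition. The sketch below illustrates only that special case; the data, the partition and the function names are illustrative assumptions.

```python
def conditional_expectation(p, b, blocks):
    """Classical E[b | partition]: the p-weighted average of b on each block."""
    out = [0.0] * len(b)
    for block in blocks:
        mass = sum(p[i] for i in block)
        average = sum(p[i] * b[i] for i in block) / mass
        for i in block:
            out[i] = average
    return out

def mean_square_error(p, b, k):
    """||b - k||^2 = E[(b - k)^2] under the probabilities p."""
    return sum(pi * (bi - ki) ** 2 for pi, bi, ki in zip(p, b, k))

p = [0.1, 0.2, 0.3, 0.15, 0.25]      # probabilities of the sample points
b = [2.0, -1.0, 0.5, 3.0, 1.0]       # the observable B
blocks = [[0, 1], [2, 3, 4]]         # atoms of the observed algebra
best = conditional_expectation(p, b, blocks)
# A few competitors that are also constant on the blocks (i.e. elements of the algebra).
competitors = [[c0, c0, c1, c1, c1] for c0 in (-1.0, 0.0, 0.5) for c1 in (0.0, 1.0, 2.0)]
```

No block-constant competitor has a smaller mean square error than `best`, and the residual `b - best` is orthogonal to every block-constant function.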
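Remark 3.17. The usual elementary properties of classical conditional expectations 
and their proofs [61] carry over directly. In particular, we have linearity, 
positivity, invariance of the state $\mathbb{P}(\mathbb{P}(B|\mathscr{C})) = \mathbb{P}(B)$, invariance of $\mathscr{C}$ ($\mathbb{P}(B|\mathscr{C}) = B$ 
if $B \in \mathscr{C}$), the tower property $\mathbb{P}(\mathbb{P}(B|\mathscr{A})|\mathscr{C}) = \mathbb{P}(B|\mathscr{C})$ if $\mathscr{C} \subset \mathscr{A}$, the module property 
$\mathbb{P}(AB|\mathscr{C}) = B\,\mathbb{P}(A|\mathscr{C})$ for $B \in \mathscr{C}$, etc. As an example, let us prove linearity. It 
suffices to show that $Z = \alpha\,\mathbb{P}(A|\mathscr{C}) + \beta\,\mathbb{P}(B|\mathscr{C})$ satisfies $\mathbb{P}(ZC) = \mathbb{P}((\alpha A + \beta B)C)$ for 
all $C \in \mathscr{C}$. But this is immediate from the linearity of $\mathbb{P}$ and Definition 3.13. □ 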

3.4. The Bayes formula. In §2 we were able to calculate conditional expectations 
explicitly as all algebras were finite-dimensional. In most physical situations, 
however, at least the probe (and often the system as well) admits continuous observ- 
ables and therefore we must deal with infinite-dimensional algebras. In this case it is 
usually not so simple to calculate the conditional expectations directly; however, the 
following Bayes-type formula will be of considerable assistance. 

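Lemma 3.18 (Bayes formula [18]). Let $\mathscr{C}$ be a commutative von Neumann algebra 
and let $\mathscr{C}'$ be equipped with a normal state $\mathbb{P}$. Choose $V \in \mathscr{C}'$ such that $V^*V > 0$ and 
$\mathbb{P}(V^*V) = 1$. Then we can define a new state on $\mathscr{C}'$ by $\mathbb{Q}(A) = \mathbb{P}(V^*AV)$ and 

$\mathbb{Q}(X|\mathscr{C}) = \frac{\mathbb{P}(V^*XV\,|\,\mathscr{C})}{\mathbb{P}(V^*V\,|\,\mathscr{C})}, \qquad X \in \mathscr{C}'.$ 
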
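Proof. Let $K$ be an element of $\mathscr{C}$. For all $X \in \mathscr{C}'$ we can write 

$\mathbb{P}(\mathbb{P}(V^*XV|\mathscr{C})K) = \mathbb{P}(V^*XKV) = \mathbb{Q}(XK) = \mathbb{Q}(\mathbb{Q}(X|\mathscr{C})K)$ 

$\qquad = \mathbb{P}(V^*V\,\mathbb{Q}(X|\mathscr{C})K) = \mathbb{P}(\mathbb{P}(V^*V\,\mathbb{Q}(X|\mathscr{C})K\,|\,\mathscr{C})) = \mathbb{P}(\mathbb{P}(V^*V|\mathscr{C})\,\mathbb{Q}(X|\mathscr{C})K).$ 

As this holds for all $K \in \mathscr{C}$, and as by construction the conditional expectations 
are elements of $\mathscr{C}$, we conclude that $\|\mathbb{P}(V^*XV|\mathscr{C}) - \mathbb{P}(V^*V|\mathscr{C})\,\mathbb{Q}(X|\mathscr{C})\|_{\mathbb{P}} = 0$, or 
equivalently $\mathbb{P}(V^*XV|\mathscr{C}) = \mathbb{P}(V^*V|\mathscr{C})\,\mathbb{Q}(X|\mathscr{C})$ $\mathbb{P}$-a.s. □ 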
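
In the commutative, finite special case the lemma reduces to the classical Bayes rule for a change of measure with density $V^*V$, which is easy to check by hand. The sketch below covers only that special case (all operators diagonal); the numbers, the partition and the names are illustrative assumptions.

```python
import math

def cond_exp(p, x, blocks):
    """Classical E_p[x | partition]: the p-weighted average of x on each block."""
    out = [0.0] * len(x)
    for block in blocks:
        mass = sum(p[i] for i in block)
        average = sum(p[i] * x[i] for i in block) / mass
        for i in block:
            out[i] = average
    return out

p = [0.1, 0.2, 0.3, 0.15, 0.25]          # reference state P
blocks = [[0, 1], [2, 3, 4]]             # atoms of the commutative algebra C
v = [1.2, 0.7, 0.9, 1.5, 0.8]            # a positive diagonal V, before normalization
norm = math.sqrt(sum(pi * vi * vi for pi, vi in zip(p, v)))
v = [vi / norm for vi in v]              # enforce P(V*V) = 1
q = [pi * vi * vi for pi, vi in zip(p, v)]   # new state Q(A) = P(V*AV)
x = [2.0, -1.0, 0.5, 3.0, 1.0]           # the observable X

lhs = cond_exp(q, x, blocks)                                              # Q(X | C)
numerator = cond_exp(p, [vi * vi * xi for vi, xi in zip(v, x)], blocks)   # P(V*XV | C)
denominator = cond_exp(p, [vi * vi for vi in v], blocks)                  # P(V*V | C)
rhs = [n / d for n, d in zip(numerator, denominator)]
```

The lists `lhs` and `rhs` agree entry by entry, which is the Bayes formula in this diagonal setting.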

We now have sufficient tools to deal with the Stern-Gerlach experiment described 
in §2. Though the following example is not of much practical importance, it demonstrates 
the use of the Bayes theorem in a concrete setting. We will use a very similar 
"reference probability method" to obtain filtering equations later on. 

Example 3.19. (Stern-Gerlach experiment). Consider an atom with two degrees 
of freedom: a spin degree of freedom $\mathscr{N}_\mu = \mathscr{B}(\mathbb{C}^2)$ carrying the observables $\sigma_x$, $\sigma_z$ 
etc., and a single spatial degree of freedom $\mathscr{N}_x = \mathscr{B}(\ell^2(\mathbb{N}))$ with the affiliated position 
$q$ and momentum $p$ observables defined$^6$ in Example 3.11 (we use the notations of that 
example). The total algebra describing the atom is then $\mathscr{N} = \mathscr{N}_\mu \otimes \mathscr{N}_x$. Initially 
the spin and position/momentum of the atom are uncorrelated; hence we work with 
the state $\mathbb{P} = \mathbb{P}_\mu \otimes \mathbb{P}_0$, where $\mathbb{P}_\mu$ is an arbitrary spin state and $\mathbb{P}_0(X) = \langle\psi_0, X\psi_0\rangle = 
\langle e(0), X e(0)\rangle$. The latter implies that initially $I \otimes q$ and $I \otimes p$ (which we will interpret 
as position and momentum in the $z$-direction) have zero mean and unit variance. 

To measure the spin, we apply a magnetic field gradient that is linear in q for some 
fixed period of time. The resulting force on the particle will cause its momentum to 
change; an observation of the momentum of the particle after the interaction should 
thus provide a measurement of its spin $\sigma_z$. In other words, the atomic spatial degree 
of freedom acts as a probe for the atomic spin degree of freedom. The action of the 
magnetic field is described by the unitary$^7$ 

$U = \exp(i\kappa\,\sigma_z \otimes q) = P_{z,1} \otimes e^{i\kappa q} + P_{z,-1} \otimes e^{-i\kappa q} = P_{z,1} \otimes W_{i\kappa} + P_{z,-1} \otimes W_{-i\kappa}$ 

where $\kappa \in \mathbb{R}$ is the field gradient. Let us thus begin by calculating the characteristic 
function of $U^*(I \otimes p)U$, the momentum of the atom after the interaction: 

$\mathbb{P}(e^{ik\,U^*(I\otimes p)U}) = \mathbb{P}(U^*(I \otimes W_{-k})U) = \mathbb{P}_\mu(P_{z,1})\,\mathbb{P}_0(W_{-i\kappa}W_{-k}W_{i\kappa})$ 

$\qquad + \mathbb{P}_\mu(P_{z,-1})\,\mathbb{P}_0(W_{i\kappa}W_{-k}W_{-i\kappa}) = \mathbb{P}_\mu(P_{z,1})\,e^{2i\kappa k - k^2/2} + \mathbb{P}_\mu(P_{z,-1})\,e^{-2i\kappa k - k^2/2}.$ 

Hence the momentum of the atom after the interaction is distributed as a sum of two 
Gaussians of unit variance and means $2\kappa$ and $-2\kappa$, which are weighted respectively by 
$\mathbb{P}_\mu(P_{z,1})$ and $\mathbb{P}_\mu(P_{z,-1})$. Note that we cannot perfectly resolve the spin-up and down 
states using a Stern-Gerlach measurement; as the tails of the two Gaussians overlap, 
there is always a nonzero probability that we assign the wrong spin to the atom by 
looking e.g. at the sign of the observed momentum. However, the error probability 
becomes very small when the gradient $\kappa$ is large. 
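
The size of the misassignment probability can be made concrete. The sketch below assumes the simple decision rule mentioned above (assign spin up when the observed momentum is positive) and uses the two unit-variance Gaussians with means $\pm 2\kappa$; the function name is an illustrative choice. Since the two Gaussians are mirror images, the error probability is the Gaussian tail $\Phi(-2\kappa)$ regardless of the spin weights.

```python
import math

def sign_rule_error(kappa: float) -> float:
    """Probability that a unit-variance Gaussian with mean 2*kappa is negative.

    This equals Phi(-2*kappa) = erfc(sqrt(2)*kappa) / 2, the chance of assigning
    the wrong spin when deciding by the sign of the observed momentum.
    """
    return 0.5 * math.erfc(math.sqrt(2.0) * kappa)

errors = {kappa: sign_rule_error(kappa) for kappa in (0.0, 0.5, 1.0, 2.0)}
```

With no gradient the rule is a coin flip, and the error decays rapidly as `kappa` grows.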

6 We saw in Remark 3.12 that this description is $*$-isomorphic to the usual definition of position 
and momentum up to some numerical constants. These are not of essence, however, as they just 
correspond to a change of units in which we measure position and momentum. A little more care 
must be taken if we wish to make quantitative predictions on the outcomes of actual experiments; 
we will not worry about this, however, and work in arbitrary units. 

7 This is the solution of Eq. (2.6) at some fixed time $t$ for a suitable interaction Hamiltonian $H$. 
After the interaction, we may want to measure a spin observable $\sigma \in \mathscr{N}_\mu$ that 
does not necessarily commute with $\sigma_z$ (e.g. $\sigma_x$). To describe this, let us calculate 
$\mathbb{P}(U^*(\sigma \otimes I)U\,|\,\mathrm{vN}(U^*(I \otimes p)U))$, the conditional expectation of the spin observable $\sigma$ 
after the interaction given our observation of the momentum of the atom. 

We begin by using the following elementary property: if $U$ is a unitary operator 
and we define the state $\mathbb{Q}(X) = \mathbb{P}(U^*XU)$, then $\mathbb{P}(U^*XU\,|\,U^*\mathscr{C}U) = U^*\,\mathbb{Q}(X|\mathscr{C})\,U$ 
(this can be verified directly using Definition 3.13). Thus we obtain 

$\mathbb{P}(U^*(\sigma \otimes I)U\,|\,\mathrm{vN}(U^*(I \otimes p)U)) = U^*\,\mathbb{Q}(\sigma \otimes I\,|\,\mathrm{vN}(I \otimes p))\,U.$ 

We would like to apply the Bayes rule to $\mathbb{Q}(\sigma \otimes I\,|\,\mathrm{vN}(I \otimes p))$. As $U$ does not commute 
with $I \otimes p$, however, the Bayes rule does not apply in this form. 

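Fortunately we can circumvent this problem using the following trick. Using the 
Baker-Campbell-Hausdorff formula, we can rewrite $e^{i\kappa q}$ as 

$e^{i\kappa q} = e^{i\kappa(a + a^*)} = e^{-\kappa^2/2}\, e^{i\kappa a^*}\, e^{i\kappa a}.$ 

Beware that the Baker-Campbell-Hausdorff formula technically only holds for exponentials 
of bounded operators; thus here and below there will be domain issues, but 
these can be resolved with suitable care. As $a\,\psi_0 = 0$, we can write 

$e^{i\kappa q}\psi_0 = e^{-\kappa^2/2}\, e^{i\kappa a^*} e^{i\kappa a}\,\psi_0 = e^{-\kappa^2/2}\, e^{i\kappa a^*}\psi_0 = e^{-\kappa^2/2}\, e^{i\kappa a^*} e^{-i\kappa a}\,\psi_0 = e^{-\kappa^2}\, e^{\kappa p}\,\psi_0.$ 
We obtain 

$\mathbb{P}_0(e^{-i\kappa q} X e^{i\kappa q}) = \langle e^{i\kappa q}\psi_0, X e^{i\kappa q}\psi_0\rangle = e^{-2\kappa^2}\langle e^{\kappa p}\psi_0, X e^{\kappa p}\psi_0\rangle = e^{-2\kappa^2}\,\mathbb{P}_0(e^{\kappa p} X e^{\kappa p}).$ 
It follows that we can equivalently replace $U$ by $V$: 

$\mathbb{Q}(X) = \mathbb{P}(U^*XU) = \mathbb{P}(V^*XV), \qquad V = e^{-\kappa^2}\, e^{\kappa\,\sigma_z \otimes p} = e^{-\kappa^2}\,(P_{z,1} \otimes e^{\kappa p} + P_{z,-1} \otimes e^{-\kappa p}).$ 
$V$ is not unitary, but it does commute with $I \otimes p$. Hence the Bayes rule gives 

$\mathbb{P}(U^*(\sigma \otimes I)U\,|\,\mathrm{vN}(U^*(I \otimes p)U)) = \frac{U^*\,\mathbb{P}(V^*(\sigma \otimes I)V\,|\,\mathrm{vN}(I \otimes p))\,U}{U^*\,\mathbb{P}(V^*V\,|\,\mathrm{vN}(I \otimes p))\,U}.$ 
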

We can now use the module property and independence of $\sigma \otimes I$ and $I \otimes p$ under $\mathbb{P}$ to 
calculate explicitly the numerator and denominator; elementary manipulations give 

V[U*(a®I)U\vN{U*(I®p)U)) = 

By definition $\mathbb{P}(U^*(\sigma \otimes I)U\,|\,\mathrm{vN}(U^*(I \otimes p)U))$ is affiliated to $\mathrm{vN}(U^*(I \otimes p)U)$, and indeed 
the expression above is simply a function of $U^*(I \otimes p)U$. If we observe $U^*(I \otimes p)U$ and 
obtain the value $p$, then the spectral theorem tells us that the conditional expectation 
takes the value given by the expression above if we simply substitute $p$ for $U^*(I \otimes p)U$. 
Note that the formula is not equivalent to the one given by the projection postulate for 
a measurement of $\sigma_z$. For large $\kappa$, however, we obtain approximately the projection 
postulate expression, and this becomes exact as $\kappa \to \infty$. □ 
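
The explicit form of the conditional expectation in Example 3.19 can be worked out from the Bayes quotient by the module property and the independence of the spin and spatial factors. The display below is a sketch derived here from the definitions of $V$ and $\mathbb{P}$ given in the example, not a quotation of the source; the shorthand $p' = U^*(I \otimes p)U$ is introduced only for this display.

```latex
% Assumptions (taken from Example 3.19):
%   V = e^{-\kappa^2}(P_{z,1}\otimes e^{\kappa p} + P_{z,-1}\otimes e^{-\kappa p}),
%   \mathbb{P} = \mathbb{P}_\mu \otimes \mathbb{P}_0.
% Shorthand introduced here: p' := U^*(I\otimes p)U.
\begin{align*}
V^*(\sigma\otimes I)V
  &= e^{-2\kappa^2}\sum_{j,k=\pm 1} P_{z,j}\,\sigma\,P_{z,k}\otimes e^{(j+k)\kappa p},\\
\mathbb{P}\bigl(V^*(\sigma\otimes I)V \,\big|\, \mathrm{vN}(I\otimes p)\bigr)
  &= e^{-2\kappa^2}\sum_{j,k=\pm 1}\mathbb{P}_\mu(P_{z,j}\,\sigma\,P_{z,k})\;
     I\otimes e^{(j+k)\kappa p},\\
\mathbb{P}\bigl(U^*(\sigma\otimes I)U \,\big|\, \mathrm{vN}(p')\bigr)
  &= \frac{\mathbb{P}_\mu(P_{z,1}\sigma P_{z,1})\,e^{2\kappa p'}
          + \mathbb{P}_\mu(P_{z,-1}\sigma P_{z,-1})\,e^{-2\kappa p'}
          + \mathbb{P}_\mu(P_{z,1}\sigma P_{z,-1} + P_{z,-1}\sigma P_{z,1})}
         {\mathbb{P}_\mu(P_{z,1})\,e^{2\kappa p'} + \mathbb{P}_\mu(P_{z,-1})\,e^{-2\kappa p'}}.
\end{align*}
```

For large $\kappa$ and a large positive observed value, the terms carrying $e^{2\kappa p'}$ dominate and the quotient approaches $\mathbb{P}_\mu(P_{z,1}\sigma P_{z,1})/\mathbb{P}_\mu(P_{z,1})$, in line with the remark on the projection postulate.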
4. Stochastic processes and quantum Ito calculus. After a general introduction 
to quantum probability, we now turn to one particular quantum probability 
space which we will use throughout the remainder of the article. In §5 we shall argue 
that this model appropriately describes the quantum electromagnetic field and 
its interaction with matter. In the laboratory, the electromagnetic field can be measured 
by devices like photodetectors which can produce an electric current or even a 
discrete photocount. The statistics of data records from such experiments are well 
approximated by the model considered here. The model is rich and we will discover 
that it contains many interesting classical stochastic processes, i.e. a whole family of 
Poisson and Wiener processes. However, these processes do not commute with each 
other. An extension of the Ito calculus, due to Hudson and Parthasarathy [41], unites 
all these processes in one noncommutative stochastic calculus. 

4.1. Poisson processes on Fock space. The theory we are about to discuss 
can be approached from many sides; here we have chosen to get started by finding 
a quantum probability space that naturally admits a Poisson process, and build the 
theory from there. As we have a particular classical process in mind, the general theory 
gives a hint as to how we could proceed. First, we define the process on a classical 
space $(\Omega, \mathcal{F}, \mathbf{P})$; equivalently, we can form the algebra $\mathscr{A} = L^\infty(\Omega, \mathcal{F}, \mathbf{P})$ acting on 
$\mathsf{H} = L^2(\Omega, \mathcal{F}, \mathbf{P})$ by pointwise multiplication, with a suitable state $\mathbb{P}$, and represent the 
process as a family of observables affiliated to $\mathscr{A}$. To create a noncommutative model, 
we could now broaden our horizon and consider $\mathscr{N} = (\mathscr{B}(\mathsf{H}), \mathbb{P})$ rather than just $\mathscr{A}$. 
Obviously such a construction does not necessarily carry a physical interpretation; 
this must be considered separately, see For the time being, however, we will use 
this convenient construction to provide us with a rich quantum stochastic model. The 
following discussion is heavily inspired by the work of Maassen ■ 

Consider a classical Poisson process on a finite time interval $[0, T]$. We wish to 
describe the space of paths $\Omega$. This is not difficult; a Poisson process on a finite time 
interval has (a.s.) finitely many jumps $n$. Hence we can specify every relevant path 
by specifying its jump times. Let us thus introduce 

$\Omega = \bigcup_{n=0}^{\infty} \Omega_n, \qquad \Omega_0 = \{\emptyset\}, \qquad \Omega_n = \{\{t_1, \ldots, t_n\} : t_1 < t_2 < \cdots < t_n \in [0, T]\}.$ (4.1) 

In other words, $\Omega$ is the set of ordered sequences in $[0, T]$ with a finite number of 
elements. We still need to introduce a $\sigma$-algebra $\mathcal{F}$ and a measure $\mathbf{P}$. To this end, 
consider $\Omega_n$ as a subset of the cube $([0, T]^n, e^{-T}\mu_n)$ where $\mu_n$ is the Lebesgue measure, 
so that $\Omega_n$ inherits a $\sigma$-algebra $\mathcal{F}_n$ and a measure $\mathbf{P}_n$ from the cube. Under $\mathbf{P}_n$ the 
jump times $t_1, \ldots, t_n$ are uniformly distributed (as must be the case for a Poisson 
process with fixed rate) and $\mathbf{P}_n(\Omega_n) = T^n e^{-T}/n!$. The measure $\mathbf{P}$ induced on $\Omega$ is 
precisely the probability measure of a Poisson process with unit rate. 

We now introduce the Hilbert space $\Gamma = L^2(\Omega, \mathcal{F}, \mathbf{P})$. It is called the symmetric 
or Boson Fock space, and plays a central role in the following. We will also need 
the spaces $\Gamma_{t]}$, $\Gamma_{[t}$ and $\Gamma_{[s,t]}$, defined identically to $\Gamma$ except that the interval $[0, T]$ 
is replaced by $[0, t]$, $[t, T]$ and $[s, t]$, respectively. It is not difficult to see that for 
any $0 < s < t < T$ we have$^8$ $\Omega = \Omega_{s]} \times \Omega_{[s,t]} \times \Omega_{[t}$, and as the Poisson process has 
independent increments the measure splits up similarly. It follows that 

$\Gamma = \Gamma_{s]} \otimes \Gamma_{[s,t]} \otimes \Gamma_{[t} \qquad \forall\, 0 < s < t < T.$ (4.2) 

8 A more precise statement would be something like $\Omega = \Omega_{s]} \times \Omega_{(s,t]} \times \Omega_{(t}$; however, the only 
paths for which this makes a difference are those that have jumps exactly at times $s$ or $t$, which is a 
set of $\mathbf{P}$-measure zero. For notational simplicity, we are free to always use closed time intervals $[s, t]$. 

This important property is known as a continuous tensor product structure; it will 
play a key role in the definition of quantum stochastic integrals, as it gives a natural 
notion of adaptedness. Indeed, the algebra $\mathscr{W} = \mathscr{B}(\Gamma)$ splits up accordingly, 

$\mathscr{W} = \mathscr{W}_{s]} \otimes \mathscr{W}_{[s,t]} \otimes \mathscr{W}_{[t} = \mathscr{B}(\Gamma_{s]}) \otimes \mathscr{B}(\Gamma_{[s,t]}) \otimes \mathscr{B}(\Gamma_{[t}).$ (4.3) 

A process of operators $\{X_t\}$ affiliated to $\mathscr{W}$ is said to be adapted if $X_t$ is affiliated to 
$\mathscr{W}_{t]}$ for every $t$; equivalently, $X_t$ is of the form $X_{t]} \otimes I$ as an operator on $\Gamma_{t]} \otimes \Gamma_{[t}$. 

Next, let us introduce a set of interesting vectors. The reader should keep in mind 
Example 3.11, which is conceptually quite similar. Let $f \in L^\infty([0, T])$ be a complex 
Lebesgue measurable function. Then we can define the exponential vector 

$e(f)(\emptyset) = 1, \qquad e(f)(\tau) = \prod_{t\in\tau} f(t), \qquad f \in L^\infty([0, T]).$ (4.4) 

It is not difficult to verify that $e(f) \in \Gamma$, as 

$\langle e(g), e(f)\rangle = \sum_{n=0}^{\infty} \frac{e^{-T}}{n!}\Bigl(\int_0^T g^*(t) f(t)\,dt\Bigr)^n = \exp\Bigl(\int_0^T (g^*(t) f(t) - 1)\,dt\Bigr);$ 

hence $\langle e(f), e(f)\rangle = e^{\|f\|_2^2 - T} < \infty$ for any $f \in L^\infty([0, T])$. We define $D$, the exponential 
domain, as the linear span of all $e(f)$, $f \in L^\infty([0, T])$, and we note that $D$ is 
dense in $\Gamma$. The exponential vectors have the important property that they factorize 
over the continuous tensor product structure (4.2): indeed, it is evident from (4.4) 
that $e(f) = e(f_{s]}) \otimes e(f_{[s,t]}) \otimes e(f_{[t})$ where $f_{t]}$ is the restriction of $f$ to $[0, t]$, etc. 

We are now ready to define a Poisson process. Let us first define it as a random 
variable on $\Omega$; we simply write $N_t(\tau) = |\tau \cap [0, t]|$, where $|\tau|$ denotes the number of 
elements in the set $\tau \in \Omega$. The random variable $N_t$ counts the number of jumps up to 
time $t$, and hence $\{N_t\}$ is by construction a Poisson process with unit rate under the 
measure $\mathbf{P}$. We now turn this into an operator process by pointwise multiplication: 

$(\Lambda_t\psi)(\tau) = N_t(\tau)\,\psi(\tau) = |\tau \cap [0, t]|\,\psi(\tau), \qquad \psi \in \Gamma,\ \tau \in \Omega,\ t \in [0, T].$ (4.5) 

$\{\Lambda_t\}$ is called the gauge process; it is not difficult to see that though $\Lambda_t$ is an unbounded 
operator$^9$, it is affiliated to $\mathscr{W}_{t]}$ and hence the gauge process is adapted; in fact, the 
increments $N_t - N_s$ are even affiliated to $\mathscr{W}_{[s,t]}$. Furthermore, $\Lambda_s$ and $\Lambda_t$ commute for 
all $s, t \in [0, T]$, and indeed $\mathrm{vN}(\Lambda_t,\ t \in [0, T]) = L^\infty(\Omega, \mathcal{F}, \mathbf{P}) \subset \mathscr{W}$ is commutative. 
Hence we could use the spectral theorem to map $\Lambda_t$ back to a classical stochastic 
process. It is somewhat futile to diagonalize the operators using the spectral theorem, 
however, as we have already constructed them in diagonal form. 

We have yet to introduce a state; a particularly interesting class of states are 
the coherent states $\mathbb{P}_f(X) = \langle e(f), X e(f)\rangle\, e^{T - \|f\|_2^2}$. Because of the continuous tensor 
product property, the coherent states split up as follows: 

$X = X_{s]} \otimes X_{[s,t]} \otimes X_{[t}, \qquad \mathbb{P}_f(X) = \mathbb{P}_{f_{s]}}(X_{s]})\; \mathbb{P}_{f_{[s,t]}}(X_{[s,t]})\; \mathbb{P}_{f_{[t}}(X_{[t}).$ (4.6) 

9 As can be verified by explicit computation, the domain of $\Lambda_t$ contains at least $D$, the exponential 
domain. The reader may ask himself why we have only defined exponential vectors $e(f)$ for $f \in 
L^\infty([0, T])$ rather than $f \in L^2([0, T])$: this is because the latter may not be in the domain of $\Lambda_t$. 
Our domain $D$ is sometimes called the restricted exponential domain in the literature. 

But as $N_t - N_s$ is affiliated to $\mathscr{W}_{[s,t]}$, it follows that under the state $\mathbb{P}_f$ the gauge 
process has independent increments. Furthermore, if we denote by $P_{N_t - N_s}$ the 
spectral measure of $N_t - N_s$, then we have 

$\mathbb{P}_f(P_{N_t - N_s}(\{n\})) = \frac{1}{n!}\Bigl(\int_s^t |f(r)|^2\,dr\Bigr)^n e^{-\int_s^t |f(r)|^2\,dr}.$ 

Evidently, $\Lambda_t$ is an inhomogeneous Poisson process with rate $|f(t)|^2$ under the state 
$\mathbb{P}_f$. Note in particular that as $e(1)(\tau) = 1$, we have for any $X \in L^\infty(\Omega, \mathcal{F}, \mathbf{P})$ the 
relation $\mathbb{P}_1(X) = \langle 1, X 1\rangle = \mathbf{E}_{\mathbf{P}}(X)$; hence the fact that under $\mathbb{P}_1$ the gauge process 
is a Poisson process with unit rate is exactly what we expect from the definition of $\mathbf{P}$. 
Under $\mathbb{P}_0$, on the other hand, the gauge process doesn't register any counts; $\mathbb{P}_0 = \phi$ 
is called the vacuum state, and $e(0) = \Phi$ is called the vacuum vector. 

4.2. Weyl operators and Wiener processes. We have now exhausted the 
diagonal observables affiliated to the space $(L^\infty(\Omega, \mathcal{F}, \mathbf{P}), \mathbb{P}_f)$: every such observable 
is some functional of the Poisson process $\Lambda_t$ with rate $|f|^2$. Let us thus explore whether 
we can find interesting observables affiliated to $\mathscr{W}$ that do not commute with $\Lambda_t$. To 
this end, we follow again essentially Example 3.11. Given $f, g \in L^\infty([0, T])$ we look for 
a unitary operator $W(f)$ that implements the translation group $W(f)\,e(g) \propto e(f + g)$. 
A calculation identical to the one in Example 3.11 shows that we should define 

$W(f)\,e(g) = e^{-\int_0^T (f^*(t) g(t) + \frac{1}{2} f^*(t) f(t))\,dt}\; e(f + g) = e^{-\langle f, g\rangle_2 - \|f\|_2^2/2}\; e(f + g).$ (4.7) 

The unitary operator $W(f)$ is called a Weyl operator, and provides a projective unitary 
representation in the sense that $W(f)W(g) = W(f + g)\, e^{i\,\mathrm{Im}\langle g, f\rangle_2}$. Note that it is 
sufficient to define the action of $W(f)$ only on exponential vectors; we can extend 
to $D$ by linearity, and as $D$ is dense and $W(f)$ is bounded the Weyl operators are 
uniquely extended to all of $\Gamma$. An important property, which follows immediately 
from the definition of $W(f)$ and the continuous tensor product property, is that 

$W(f)\,e(g) = W(f_{s]})\,e(g_{s]}) \otimes W(f_{[s,t]})\,e(g_{[s,t]}) \otimes W(f_{[t})\,e(g_{[t}).$ (4.8) 

In particular, we see that $W(f\chi_{[0,t]})$ is an adapted operator process. 

Now fix $f \in L^\infty([0, T])$ and consider the unitary group $\{W(tf)\}_{t\in\mathbb{R}}$; this group is 
in fact continuous [53], and hence by Stone's Theorem 3.10 there exists a self-adjoint 
$B(f)$ such that $W(kf) = e^{ikB(f)}$. The operators $B(f)$, $f \in L^\infty([0, T])$, are called field 
operators. Finding the distribution of the observable $B(f)$ is straightforward, as the 
characteristic function of $B(f)$ (under the coherent state $\mathbb{P}_g$) is given by 

$b_f(k) = \mathbb{P}_g(W(kf)) = \langle e(g), e(g + kf)\rangle\, e^{T - \|g\|_2^2 - k\langle f, g\rangle_2 - k^2\|f\|_2^2/2} = e^{2ik\,\mathrm{Im}\langle g, f\rangle_2 - k^2\|f\|_2^2/2}.$ 

Hence $B(f)$ is a Gaussian random variable with mean $2\,\mathrm{Im}\langle g, f\rangle_2$ and variance $\|f\|_2^2$. 
In the vacuum, i.e. $g = 0$, the mean vanishes; for simplicity, we will restrict ourselves 
to the vacuum case in the following. 

Consider the operator process $\{B_t^\varphi = B(e^{i\varphi}\chi_{[0,t]}) : t \in [0, T]\}$ for some fixed, 
real function $\varphi \in L^\infty([0, T])$. $B_t^\varphi$ is adapted, as we have already established that 
$W(f\chi_{[0,t]})$ is adapted for any $f$; moreover, $B(e^{i\varphi}\chi_{[s,t]}) = B_t^\varphi - B_s^\varphi$ is affiliated to 
$\mathscr{W}_{[s,t]}$ due to Eq. (4.8). This immediately tells us two important things: first, $B_t^\varphi$ and 
$B_s^\varphi$ commute for all $s, t \in [0, T]$; indeed, $B_t^\varphi - B_s^\varphi$ must commute with $B_s^\varphi - B_0^\varphi$, 
and commutativity follows from $B_0^\varphi = 0$. This means that $\mathrm{vN}(B_t^\varphi,\ t \in [0, T])$ is a 
commutative algebra and hence we can represent $B_t^\varphi$ for every $t$ as a classical random 
variable on the same probability space $(\Omega^\varphi, \mathcal{F}^\varphi, \mathbf{P}^\varphi)$; in particular, $\iota(B_t^\varphi)$ is a classical 
stochastic process. Second, Eq. (4.6) implies that the process $B_t^\varphi$ has independent 
increments. But we have established $B_t^\varphi - B_s^\varphi$ is (in the vacuum) a mean zero Gaussian 
random variable with variance $t - s$, and as $B_t^\varphi$ has independent increments we have 
established that $\iota(B_t^\varphi)$ is precisely a Wiener process on $(\Omega^\varphi, \mathcal{F}^\varphi, \mathbf{P}^\varphi)$. 

Let us introduce the following notation. Define $Q_t = B(i\chi_{[0,t]})$, $P_t = B(-\chi_{[0,t]})$, 
and $A_t = (Q_t + iP_t)/2$. Note that $Q_t$ and $P_t$ are self-adjoint by Stone's theorem, 
whereas $A_t$ has the adjoint $A_t^* = (Q_t - iP_t)/2$. We now compute 

$B(f)\,e(g) = \frac{1}{i}\frac{d}{dk}\, W(kf)\,e(g)\Big|_{k=0} = i\langle f, g\rangle_2\, e(g) - i\,\frac{d}{dk}\, e(g + kf)\Big|_{k=0}.$ 

Evidently $A_t\, e(g) = \langle\chi_{[0,t]}, g\rangle_2\, e(g) = \int_0^t g(s)\,ds\; e(g)$. But then we can write 

$(A_t\, e(g))(\tau) = \int_0^t g(s)\,ds \prod_{r\in\tau} g(r) = \int_0^t g(s) \prod_{r\in\tau} g(r)\,ds = \int_0^t e(g)(\tau \cup \{s\})\,ds.$ 

In particular, this formula extends to any $\psi \in \Gamma$ for which the integral on the righthand 
side (with $e(g)$ replaced by $\psi$) defines a normalizable vector. $A_t$ is called the Fock 
space annihilation operator, as it generalizes the corresponding notion introduced in 
Example 3.11. The reader should verify that its adjoint can be expressed as 

$(A_t^*\psi)(\tau) = \sum_{s \in \tau \cap [0,t]} \psi(\tau \setminus \{s\})$ 

on a sufficiently large domain. Not surprisingly, $A_t^*$ is called the creation operator. It 
is conventional in quantum stochastic calculus to use $A_t$ and its adjoint rather than 
$Q_t$ and $P_t$; we shall conform to this standard. 

In summary, we have constructed a quantum probability space $(\mathscr{W}, \phi)$ that admits 
an entire family (indexed by $\varphi$) of Wiener processes. Note however, that these processes 
do not necessarily commute for different $\varphi$; in fact, it is not difficult to establish 
that $[B(f), B(g)]\,\psi = 2i\,\mathrm{Im}\langle f, g\rangle_2\, \psi$ on a suitably large domain (e.g. $\psi \in D$). Therefore, 
even though every $B_t^\varphi$ defines a Wiener process, these cannot be represented on 
the same classical probability space for different $\varphi_{1,2}$ unless $\mathrm{Im}(e^{i(\varphi_1 - \varphi_2)}) = 0$. 

We have also defined a Poisson process $\Lambda_t$, but unfortunately it vanishes in the 
vacuum. Consider, however, the process $\Lambda_t(f) = W(f)^*\Lambda_t W(f)$; for any Borel function 
$b$ we can write $\phi(b(\Lambda_{t_1}(f), \ldots, \Lambda_{t_n}(f))) = \mathbb{P}_f(b(\Lambda_{t_1}, \ldots, \Lambda_{t_n}))$. Evidently $\Lambda_t(f)$ 
has the same statistics in the vacuum as does $\Lambda_t$ under the coherent state $\mathbb{P}_f$. This 
shows that we can define even a whole family of Poisson processes in the vacuum. We 
do not lose much by restricting ourselves to the vacuum as an underlying state (as we 
will do in the remainder of the article), as we can always transform to a coherent state 
by "sandwiching" with Weyl operators. Note that like the family $B_t^\varphi$, the processes 
$\Lambda_t(f)$ do not commute amongst each other. We see that the quantum probability 
space $(\mathscr{W}, \phi)$ gives rise to a rich family of incompatible stochastic processes. 

4.3. Quantum stochastic calculus. Now that we have obtained Wiener and 
Poisson processes, we can try to develop stochastic integrals with respect to these 
processes and an associated stochastic calculus. Note that if we were only interested 
in, e.g., integrating with respect to $Q_t$ an adapted process which commutes with $Q_t$, 
then we could simply use the classical Ito integral definition through the spectral 
theorem. This will not suffice for our purposes, however, as we will want to consider 
stochastic differential equations that are driven simultaneously by the noncommuting 
noises $Q_t$ and $P_t$ (and even $\Lambda_t$). Moreover, we would like to have an Ito rule that tells 
us how to multiply stochastic integrals with respect to $Q_t$ and $P_t$. 

Our motivation for developing generalized quantum stochastic calculus is that this 
allows us to rigorously define and manipulate Schrodinger equations, as in Eq. (2.6), 
with a white-noise Hamiltonian formally defined by $H(t) = H_0 + H_1 Q_t + H_2 P_t$. In 
we will see that such models emerge naturally in applications. In this section we sketch 
the development of quantum stochastic calculus as it was introduced in a seminal 
paper by Hudson and Parthasarathy [41]. For a full development of this calculus we 
refer to [41], [40], [51]. The Hudson-Parthasarathy approach has some technical issues, 
not surprisingly involving the unboundedness of operators, the full extent of which is 
still being explored. Though we cannot go into detail here, we will attempt to sketch 
some of the issues and give references to recent literature. 

We work in the following setting. We wish to integrate processes against the 
three noises $A_t$, $A_t^*$ and $\Lambda_t$ (the fundamental noises), i.e. we want to define $\int_0^t L_s\,dM_s$ 
where $M_t$ is one of the fundamental noises. The noises are defined on the quantum 
probability space $(\mathscr{W}, \phi)$, but we will want to couple these noises to an external 
quantum system, the initial system$^{10}$, with which they interact. To this end, let us 
introduce the initial Hilbert space $\mathsf{h}$, $\mathscr{B} = \mathscr{B}(\mathsf{h})$ and the associated initial quantum 
probability space $(\mathscr{B}, \rho)$. We will choose our integrands $L_t$ to be adapted processes 
on $(\mathscr{B} \otimes \mathscr{W}, \rho \otimes \phi)$, i.e. each $L_t$ is affiliated to $\mathscr{B} \otimes \mathscr{W}_{t]}$ and acts as $I$ on $\Gamma_{[t}$. 

As usual, we begin with simple processes. Given $s < t$, recall that for the fundamental 
processes $M_t - M_s$ is affiliated to $\mathscr{W}_{[s,t]}$, whereas for adapted processes $L_s$ is 
affiliated to $\mathscr{B} \otimes \mathscr{W}_{s]}$; hence we can naturally write $L_s(M_t - M_s) = L_s \otimes (M_t - M_s)$. In particular 
the increment $M_t - M_s$ commutes with $L_s$, and we have no problems with operator 
multiplication of these unbounded operators. Let $\{t_i : i = 0, \ldots, n,\ t_i < t_{i+1}\}$ 
be a sequence of times with $t_0 = 0$ and $t_n = T$. By definition, we set 

$L_t = \sum_{i=0}^{n-1} L_{t_i}\,\chi_{[t_i, t_{i+1})}(t) \quad\Longrightarrow\quad \int_0^t L_s\,dM_s = \sum_{i=0}^{n-1} L_{t_i} \otimes (M_{t_{i+1}\wedge t} - M_{t_i\wedge t}).$ 

This definition makes sense as long as the operators $L_t$ and $M_t$ have a sufficiently 
large common dense domain that the sum is well defined. To enforce this, we will 
require that the domain of every $L_t$ contains at least the exponential domain $D$. 
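
The bookkeeping in the simple-process integral, in particular the role of the truncated times $t_{i+1}\wedge t$ and $t_i \wedge t$, can be illustrated with ordinary numbers standing in for the operators. In the sketch below the values $L_{t_i}$ are scalars and $M$ is an ordinary function of time, so the tensor product reduces to a product; this replacement, the sample data and the function name are assumptions made purely for illustration and do not capture the operator-theoretic content.

```python
def simple_integral(times, L_values, M, t):
    """Sum over i of L_{t_i} * (M(min(t_{i+1}, t)) - M(min(t_i, t))).

    times    -- partition t_0 < t_1 < ... < t_n
    L_values -- the value L_{t_i} held on [t_i, t_{i+1}), for i = 0..n-1
    M        -- the integrator, here an ordinary function of time
    t        -- upper limit of integration
    """
    total = 0.0
    for i in range(len(times) - 1):
        increment = M(min(times[i + 1], t)) - M(min(times[i], t))
        total += L_values[i] * increment
    return total

times = [0.0, 1.0, 2.0, 3.0]
L_values = [2.0, -1.0, 4.0]
M = lambda u: u * u          # stand-in integrator
value = simple_integral(times, L_values, M, 2.5)
```

Intervals lying entirely after `t` contribute nothing because both truncated times equal `t`, and the interval containing `t` contributes only the increment up to `t`.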

Now comes the hard part in any integration theory: given a quadruple of suitably 
restricted adapted processes $(E, F, G, H)$, such that these admit simple approximations 
$(E^n, F^n, G^n, H^n)$, we wish to define the integral 

$I_t = \int_0^t (E_s\,d\Lambda_s + F_s\,dA_s + G_s\,dA_s^* + H_s\,ds)$ (4.9) 

as a limit, in some sense, of the corresponding integrals $I_t^n$ over the simple processes. 
Recall that in the classical theory, the Ito isometry allows us to define the stochastic 
integral as a mean-square limit of simple processes, and a little more work shows 



10 This name has the following origin. Recall from J^Jthat obscrvablcs X evolve in time as Xt = 
U£XUt (we will define a unitary evolution Ut in JSJ- We would like to think ofX(g)/G^®^as 
describing the external system; however, Uf(X (gi I)Ut will not be of the form Y ® I except at t = 0. 
Hence the initial system observable X £3 / describes the external system at the initial time t = 0. 
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that every squarc-intcgrablc process admits a mean-square approximation by simple 
processes. Things are not quite so "simple" in the noncommutative case, however. 

To see what goes wrong, consider for simplicity the case $\mathsf{h} = \mathbb{C}$ so that we can
forget about the initial state $\rho$. We already encountered the noncommutative $L^2$
(semi)norm $\|X\|_\phi^2 = \phi(X^*X)$ when we discussed conditional expectations. We are
thus looking for a suitable unbounded operator $I_t$ such that we have mean-square
convergence, $\|I_t - I_t^n\|_\phi^2 = \langle (I_t - I_t^n)\Phi, (I_t - I_t^n)\Phi\rangle \to 0$ as $n\to\infty$.
But this is a very ill-defined problem, as it only depends on the action of $I_t$ on the
vacuum vector $\Phi$; in particular, what do we choose as the domain of $I_t$, and how do we
define $I_t$ on vectors orthogonal to $\Phi$? There could be a large number of inequivalent ways
of doing this, giving rise to limiting operators with very different properties$^{11}$.

The solution of Hudson and Parthasarathy works as follows. First of all, we fix
the domain of $I_t$ at the outset: every stochastic integral will have $\mathsf{h}\otimes\mathcal{D}$ as its
domain (one could choose a dense domain in $\mathsf{h}$ as well; we will not worry about this). To
specify $I_t$ as a limit of simple integrals $I_t^n$, we choose $I_t$ as the unique operator on
$\mathsf{h}\otimes\mathcal{D}$ such that $\langle (I_t - I_t^n)\,v\otimes\psi, (I_t - I_t^n)\,v\otimes\psi\rangle \to 0$ for every
$\psi\in\mathcal{D}$, $v\in\mathsf{h}$ (it is sufficient to verify this for $\psi = e(f)$, $f\in L^\infty([0,T])$).
In essence this is like a mean-square limit, but simultaneously for every coherent state. A
suitable estimate replaces the Itô isometry [41, Corollary 1] and shows that this limit exists
as long as $\int_0^T \|(E_s - E_s^n)\,v\otimes\psi\|^2\,ds \to 0$ as $n\to\infty$ for every $\psi\in\mathcal{D}$,
$v\in\mathsf{h}$ (and similarly for $F,G,H$), independent of the approximation. Finally,
[41, Proposition 3.2] shows that every square-integrable process, i.e.
$\int_0^T \|E_s\,v\otimes\psi\|^2\,ds < \infty$ for all $\psi\in\mathcal{D}$, $v\in\mathsf{h}$,
admits a suitable approximation by simple processes. We thus arrive at the following.

Definition 4.1 (Quantum Itô integral). An operator process $\{X_t\}$ is stochastically
integrable if it is adapted and square-integrable. Given a quadruple $(E,F,G,H)$
of such processes, the stochastic integral (4.9) is uniquely defined as the limit of simple
approximations on the domain $\mathsf{h}\otimes\mathcal{D}$.

A property that we will exploit in the sequel is $A_t\Phi = \Lambda_t\Phi = 0$. It is immediate
from the definition that stochastic integrals with respect to $A_t$ and $\Lambda_t$ acting on $\Phi$
vanish. Hence the vacuum expectations of stochastic integrals with respect to $A_t$ and
$\Lambda_t$ vanish as well. Furthermore, as $\langle\Phi, A_t^*\Phi\rangle = \langle A_t\Phi,\Phi\rangle = 0$, we see
that at least for simple processes (and indeed this holds for any integrand) the vacuum
expectation of stochastic integrals with respect to $A_t^*$ vanishes. Note, however, that
$A_t^*\Phi \neq 0$.

Our next task is to develop a stochastic calculus; the integrals defined above
are not of much use, unless we have an Itô product rule with which they can be
manipulated. Once again we run into unpleasant problems. If $I_t$ and $J_t$ are integrals
of the form (4.9), there is no reason to expect that their product $I_tJ_t$ is a well-defined
operator on the domain $\mathsf{h}\otimes\mathcal{D}$. The idea of Hudson and Parthasarathy is inspired by
the identity $\langle\psi', X^*Y\psi\rangle = \langle X\psi', Y\psi\rangle$ for bounded operators; rather than finding an
expression for $I_tJ_t$, they calculate $\langle I_t\,v'\otimes\psi', J_t\,v\otimes\psi\rangle$ for every
$v,v'\in\mathsf{h}$, $\psi,\psi'\in\mathcal{D}$, which is always well defined. One finds explicitly a lengthy
expression [41, Theorems 4.3-4.4], which is essentially the quantum Itô rule expressed in
terms of $\mathsf{h}\otimes\mathcal{D}$-matrix elements.

In practice, however, we are mostly interested in calculating actual operator products
$I_tJ_t$. We will need the concept of an adjoint pair; two operators $X$ and $X^\dagger$ are
said to be an adjoint pair if
$\langle v'\otimes\psi', X\,v\otimes\psi\rangle = \langle X^\dagger\,v'\otimes\psi', v\otimes\psi\rangle$ for every
$v,v'\in\mathsf{h}$, $\psi,\psi'\in\mathcal{D}$. It is not difficult to verify that if $(E,F,G,H)$ and
$(E^\dagger,F^\dagger,G^\dagger,H^\dagger)$ are adjoint pairs, then $I_t$ and $I_t^\dagger$ form an adjoint pair, where

$$
I_t^\dagger = \int_0^t \bigl(E_s^\dagger\,d\Lambda_s + F_s^\dagger\,dA_s^* + G_s^\dagger\,dA_s + H_s^\dagger\,ds\bigr). \tag{4.10}
$$

In essence, the adjoint $\dagger$ replaces the Hilbert space adjoint $*$ on the domain $\mathsf{h}\otimes\mathcal{D}$.
Now suppose that we can verify explicitly that the product $I_tJ_t$ is well defined; then
we can read off an expression for $I_tJ_t$ from the matrix elements
$\langle I_t^\dagger\,v'\otimes\psi', J_t\,v\otimes\psi\rangle$. This gives the following explicit form of the quantum Itô rule.

11 This was not a problem for the definition of conditional expectations; as all versions of the
conditional expectation are affiliated to a single commutative algebra, they are a.s. equivalent by the
spectral theorem. On the other hand, various "versions" of $I_t$ that satisfy $\|I_t - I_t^n\|_\phi \to 0$ need not
even commute, and such operators are fundamentally inequivalent.

Theorem 4.2 (Quantum Itô rule [3, Proposition 25.26]). Let $(F,G,H,I)$,
$(B,C,D,E)$ and $(B^\dagger,C^\dagger,D^\dagger,E^\dagger)$ be quadruples of stochastically integrable processes
such that the latter two quadruples are adjoint pairs. Define the stochastic integrals

$$
\begin{aligned}
dX_t &= B_t\,d\Lambda_t + C_t\,dA_t + D_t\,dA_t^* + E_t\,dt,\\
dY_t &= F_t\,d\Lambda_t + G_t\,dA_t + H_t\,dA_t^* + I_t\,dt,
\end{aligned}
$$

and suppose that we have verified that the product $X_tY_t$ is well defined and that
$X_tF_t,\ldots,X_tI_t$, $B_tY_t,\ldots,E_tY_t$, and $B_tF_t, B_tG_t,\ldots,E_tI_t$ are well defined and
stochastically integrable. Then the process $X_tY_t$ satisfies the relation

$$
d(X_tY_t) = X_t\,dY_t + (dX_t)\,Y_t + dX_t\,dY_t,
$$

where $X_t\,dY_t = X_tF_t\,d\Lambda_t + X_tG_t\,dA_t + X_tH_t\,dA_t^* + X_tI_t\,dt$,
$(dX_t)\,Y_t = B_tY_t\,d\Lambda_t + C_tY_t\,dA_t + D_tY_t\,dA_t^* + E_tY_t\,dt$, and
$dX_t\,dY_t = B_tF_t\,d\Lambda_t + C_tF_t\,dA_t + B_tH_t\,dA_t^* + C_tH_t\,dt$ is evaluated according to
the quantum Itô table

$$
\begin{array}{c|cccc}
dX \backslash dY & dA_t & d\Lambda_t & dA_t^* & dt \\ \hline
dA_t & 0 & dA_t & dt & 0 \\
d\Lambda_t & 0 & d\Lambda_t & dA_t^* & 0 \\
dA_t^* & 0 & 0 & 0 & 0 \\
dt & 0 & 0 & 0 & 0
\end{array}
$$

In particular, the theorem holds if $B_t$, $C_t$, $D_t$, $E_t$ and $X_t$ are bounded processes [41],
in which case the adjoints $B^\dagger$ etc. are simply taken to be the Hilbert space adjoints
$B^*$ etc., and $X_t$ extends uniquely to a bounded operator in $\mathscr{B}\otimes\mathscr{W}_{t]}$.
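The bookkeeping in the Itô table is easy to mechanize. The sketch below is an illustrative Python encoding, restricted to scalar (commuting) coefficients so that only the order of the noise differentials matters; the dictionary representation and the names `ITO_TABLE` and `ito_product` are assumptions of this example, not notation from the text. It checks two consequences of the table: a differential of the form $i e^{-i\varphi}dA_t - i e^{i\varphi}dA_t^*$ squares to $dt$, and a differential of the form $d\Lambda_t + \bar f\,dA_t + f\,dA_t^* + |f|^2dt$ squares to itself.

```python
import cmath
from collections import defaultdict

# Nonzero entries of the quantum Ito table, keyed as (left factor, right factor).
ITO_TABLE = {
    ("dA", "dL"): "dA",     # dA dLambda = dA
    ("dA", "dA*"): "dt",    # dA dA* = dt
    ("dL", "dL"): "dL",     # dLambda dLambda = dLambda
    ("dL", "dA*"): "dA*",   # dLambda dA* = dA*
}

def ito_product(dX, dY):
    """Second-order term dX dY for differentials with scalar coefficients.

    dX and dY map noise labels ("dL", "dA", "dA*", "dt") to complex numbers.
    """
    out = defaultdict(complex)
    for noise_x, coeff_x in dX.items():
        for noise_y, coeff_y in dY.items():
            target = ITO_TABLE.get((noise_x, noise_y))
            if target is not None:
                out[target] += coeff_x * coeff_y
    return {k: v for k, v in out.items() if abs(v) > 1e-12}

phi = 0.3
dB = {"dA": 1j * cmath.exp(-1j * phi), "dA*": -1j * cmath.exp(1j * phi)}
dB_squared = ito_product(dB, dB)          # only a dt term survives

f = 2 + 1j
dLf = {"dL": 1, "dA": f.conjugate(), "dA*": f, "dt": abs(f) ** 2}
dLf_squared = ito_product(dLf, dLf)       # reproduces dLf term by term
```

For operator-valued coefficients the same table applies, but the coefficient products must then be kept in the order in which they appear in Theorem 4.2.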

Remark 4.3. The choice to restrict attention to a fixed domain $\mathsf{h}\otimes\mathcal{D}$ allows
Hudson-Parthasarathy to develop a viable quantum stochastic calculus. This choice,
however, has quite a few drawbacks; we highlight one of the problems. Suppose $X$ is
self-adjoint; implicit in this statement is that the domains of $X$ and $X^*$ coincide. It
can happen that if we restrict the domain of $X$, then the restricted operator admits
many inequivalent self-adjoint extensions; see [55, pages 257-259] for an example.
Hence the restriction to a fixed domain can become a real, physical problem, that
prevents us from uniquely interpreting unbounded operators on $\mathsf{h}\otimes\mathcal{D}$ as observables.

Such problems have prompted the development of alternative approaches to quantum
stochastic integration, and the topic is still under active investigation. In a significant
recent achievement Attal and Lindsay, building on several earlier approaches
(see e.g. [2] and the references therein), develop a theory in which the integrals
achieve their maximal domains. Unfortunately, the theory is very technical and a
little daunting for everyday use. A different approach that even precedes Hudson and
Parthasarathy is that of Barnett, Streater and Wilde. Their theory is attractive
as it is completely algebraic in nature (the Hilbert space and its domains do not play
a fundamental role), but lacks a satisfactory Itô rule.

Despite these issues, the Hudson-Parthasarathy approach works quite well. In
practice one usually works with a "noisy Schrödinger equation", Eq. (5.2), the solution
of which is unitary and thus bounded. As long as the integrals and integrands are
bounded, they are uniquely defined by their specification on a dense domain. In this
article, in keeping with our attitude towards unbounded operators, we will not worry
about such issues and assume that we can apply the quantum Itô rules. □

Example 4.4. In §5 we will encounter quantum stochastic differential equations
(QSDE), the treatment of which proceeds along the same lines as the classical theory.
We claim that the Weyl operator $W(f_{t]})$ is the solution of the QSDE

$$
dW(f_{t]}) = \Bigl\{f(t)\,dA_t^* - f(t)^*\,dA_t - \tfrac12|f(t)|^2\,dt\Bigr\}\,W(f_{t]}). \tag{4.11}
$$

In particular, one can verify the Weyl relation $W(f)W(g) = W(f+g)\,e^{i\,\mathrm{Im}\langle g,f\rangle}$
directly using the quantum Itô rule. From Eq. (4.11) and $W(kf) = e^{ikB(f)}$ we obtain

$$
B(f) = \int_0^T \bigl(i f(t)^*\,dA_t - i f(t)\,dA_t^*\bigr).
$$

Hence $dB_t^\varphi = i e^{-i\varphi}\,dA_t - i e^{i\varphi}\,dA_t^*$, and the quantum Itô rules reduce to the
classical Itô rule $(dB_t^\varphi)^2 = dt$. Finally, recall that we defined Poisson processes
$\Lambda_t(f) = W(f)^*\Lambda_tW(f) = W(f_{t]})^*\Lambda_tW(f_{t]})$ (the latter equality is due to
$W(f) = W(f_{t]})\otimes W(f_{[t})$ and the fact that $W(f_{[t})\in\mathscr{W}_{[t}$ is unitary and commutes
with the adapted process $\Lambda_t$). Using the quantum Itô rule we obtain the explicit representation

$$
d\Lambda_t(f) = d\Lambda_t + f(t)^*\,dA_t + f(t)\,dA_t^* + |f(t)|^2\,dt, \tag{4.12}
$$

for which the quantum Itô rules reduce to the classical product rule
$(d\Lambda_t(f))^2 = d\Lambda_t(f)$ for a Poisson process. □

5. The filtering problem in quantum optics. Many realistic physical scenarios
are very well described by quantum stochastic differential equations driven by
the processes $A_t$, $A_t^*$ and $\Lambda_t$ discussed in the previous section. Of course, as in the
classical theory, white noise systems are only an idealization of physical interactions;
a Markov limit of wide-band noise in the spirit of Wong and Zakai [25] gives stochastic
models in the Itô form. For a large class of quantum systems, particularly those
arising in the field of quantum optics, such approximations are extremely good and
describe laboratory experiments essentially to experimental precision. Though a detailed
discussion of the physics involved in the modelling of such systems is beyond the
scope of this article, we here very briefly describe the physical origin of the equations
that are widely used in the physics community [32], describe the measurements that
are made, and set up the quantum filtering problem to be solved.

5.1. The quantum optics model. The basic model of quantum optics consists
of some fixed physical system, e.g. a collection of atoms, in interaction with the
electromagnetic field. The atomic observables are self-adjoint operators on a Hilbert
space $\mathsf{h}$. The description of the electromagnetic field and its interaction with the
atoms follows from basic physical arguments (see the excellent monograph [21] for a
thorough treatment of this theory, known as quantum electrodynamics). It turns out
that the free electromagnetic field, i.e. an optical field in empty space, is described
by a stationary Gaussian (noncommutative) wide band noise $a(t,z)$ that propagates
through space at the speed of light $c$; i.e. if we restrict ourselves to a single spatial
dimension, $a(t+\tau,z) = a(t,z-c\tau)$. If we now place the atoms at the origin $z = 0$,
then the quantum dynamics is given by a Schrödinger equation of the form

$$
\frac{d}{dt}U(t) = \bigl\{-iH + L\,a^*(t,0) - L^*\,a(t,0)\bigr\}\,U(t), \qquad U(0) = I, \tag{5.1}
$$

where $L\in\mathscr{B}$ is an atomic (dipole) operator and $H\in\mathscr{B}$ is an atomic Hamiltonian, $H$
being self-adjoint. This equation, which follows directly from the physical model, has
wide-band right hand side. Note that we have set $\hbar = 1$ for convenience, a convention
ubiquitous in physics (the only consequence is a change of units).

We now want to approximate the wide-band noise by white noise. This can be
done in a rigorous way [35], but we will not detail the procedure here (a brief
sketch can be found elsewhere). Suffice it to say that one arrives at the following quantum
stochastic differential equation (QSDE)

$$
dU_t = \Bigl\{L\,dA_t^* - L^*\,dA_t - \tfrac12 L^*L\,dt - iH\,dt\Bigr\}\,U_t, \qquad U_0 = I, \tag{5.2}
$$

which is driven by the non-commuting white noise processes $A_t$ and $A_t^*$. Note that
this is almost precisely of the same form as Eq. (5.1), except that we have added
the Itô correction term $-\tfrac12 L^*L\,U_t\,dt$. A Picard iteration argument ensures
existence and uniqueness of the solution. The adjoint $U_t^*$ satisfies

$$
dU_t^* = U_t^*\Bigl\{L^*\,dA_t - L\,dA_t^* - \tfrac12 L^*L\,dt + iH\,dt\Bigr\}, \qquad U_0^* = I.
$$

Using the quantum Itô rule we can calculate $d(U_t^*U_t) = d(U_tU_t^*) = 0$, i.e. the solution
$U_t$ is unitary for all $t$ (as the solution of a Schrödinger equation should be).
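For the reader who wishes to see the Itô table at work, the computation behind $d(U_t^*U_t) = 0$ can be written out as follows. This is a sketch that takes the applicability of the quantum Itô rule for granted and uses only the equations for $dU_t$ and $dU_t^*$ stated above.

```latex
\begin{aligned}
d(U_t^*U_t) &= (dU_t^*)\,U_t + U_t^*\,dU_t + dU_t^*\,dU_t, \\
(dU_t^*)\,U_t + U_t^*\,dU_t
  &= U_t^*\bigl\{L^*\,dA_t - L\,dA_t^* - \tfrac12 L^*L\,dt + iH\,dt\bigr\}U_t \\
  &\quad + U_t^*\bigl\{L\,dA_t^* - L^*\,dA_t - \tfrac12 L^*L\,dt - iH\,dt\bigr\}U_t
   = -\,U_t^*L^*L\,U_t\,dt, \\
dU_t^*\,dU_t
  &= U_t^*\bigl(L^*\,dA_t - L\,dA_t^*\bigr)\bigl(L\,dA_t^* - L^*\,dA_t\bigr)U_t
   = U_t^*L^*L\,U_t\,dt,
\end{aligned}
```

In the last line only the product $dA_t\,dA_t^* = dt$ is nonzero in the Itô table, so the second-order term cancels the drift exactly and $U_t^*U_t = I$ for all $t$. The companion identity for $U_tU_t^*$ is obtained in the same spirit: $U_tU_t^* = I$ solves the corresponding linear equation with the correct initial condition.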

Henceforth we will take Eq. (5.2) as our physical model. $U_t$ defines the time
evolution or flow $j_t : X \mapsto U_t^*(X\otimes I)U_t$ of every atomic observable $X\in\mathscr{B}$ (recall the
time evolution in §2.1); i.e., an observation of $X\in\mathscr{B}$ at time $t$ is described by the
observable $X_t = j_t(X)$. Using the Itô rules, we find an explicit dynamical equation

$$
dj_t(X) = j_t(\mathcal{L}_{L,H}(X))\,dt + j_t([L^*,X])\,dA_t + j_t([X,L])\,dA_t^*, \qquad X\in\mathscr{B}, \tag{5.3}
$$

where the so-called Lindblad generator $\mathcal{L}_{L,H}$ is given by

$$
\mathcal{L}_{L,H}(X) = i[H,X] + L^*XL - \tfrac12\bigl(L^*LX + XL^*L\bigr), \qquad X\in\mathscr{B}.
$$

In quantum probability, this object plays the same role as the infinitesimal generator
of a Markov diffusion in classical probability theory.
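For a finite-dimensional initial system the Lindblad generator is a plain matrix expression. The following Python sketch evaluates it for a two-level atom; the specific choices $L = \sqrt{\gamma}\,\sigma_-$ and $H = \tfrac{\delta}{2}\sigma_z$, the parameter values, and the function name `lindblad_generator` are assumptions made for illustration and do not come from the text.

```python
import numpy as np

def lindblad_generator(X, L, H):
    """Return i[H, X] + L* X L - (L*L X + X L*L) / 2 for square matrices."""
    Ld = L.conj().T
    return (1j * (H @ X - X @ H) + Ld @ X @ L
            - 0.5 * (Ld @ L @ X + X @ Ld @ L))

gamma, delta = 1.0, 0.5                                   # illustrative rates
# Basis ordering (ground, excited); sigma_minus maps excited -> ground.
sigma_minus = np.array([[0, 1], [0, 0]], dtype=complex)
sigma_z = np.diag([-1.0, 1.0]).astype(complex)
L = np.sqrt(gamma) * sigma_minus
H = 0.5 * delta * sigma_z

identity = np.eye(2, dtype=complex)
excited_population = np.diag([0.0, 1.0]).astype(complex)  # projector onto excited state

gen_identity = lindblad_generator(identity, L, H)          # the generator annihilates I
gen_excited = lindblad_generator(excited_population, L, H) # equals -gamma * projector
```

The two checks mirror familiar properties of a Markov generator: it annihilates the constant observable, and for this model the excited-state population decays at rate `gamma`.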

Remark 5.1. Though it is unusual, one could use a very similar notation
in classical stochastic models. Suppose some system is described by an underlying
configuration $x_t$ that obeys $dx_t = b(x_t)\,dt + \sigma(x_t)\,dW_t$. Then the "observables" in the
theory, i.e. things we could try to measure, are functions $f$ of the configuration of the
system. The observable $f$ at time $t$ is described by the random variable $j_t(f) = f(x_t)$.
Using the classical Itô rules, we get $dj_t(f) = j_t(\mathscr{L}f)\,dt + j_t(\mathscr{S}f)\,dW_t$ where
$\mathscr{L}f(x) = \sum_i b^i(x)\,\partial_if(x) + \tfrac12\sum_{ij}\sigma^i(x)\sigma^j(x)\,\partial_i\partial_jf(x)$ is the generator of
the Markov diffusion $x_t$, and $\mathscr{S}f(x) = \sum_i \sigma^i(x)\,\partial_if(x)$. This expression is the classical
analog of (5.3); the sample paths $x_t$ do not have a quantum counterpart, however. □
[Figure 5.1: input field ($A_t$, $A_t^*$) → atoms ($j_t(X) = U_t^*XU_t$) → output field
($U_t^*A_tU_t$, $U_t^*A_t^*U_t$) → detector ($Y_t = U_t^*(A_t + A_t^*)U_t$) → filter
($\pi_t(X) = \mathbb{P}(j_t(X)\,|\,Y_{s\le t})$).]

Fig. 5.1. Cartoon of the quantum filtering setup in quantum optics. An optical field, described
by the field operators $A_t$, $A_t^*$, interacts with a system, e.g. a cloud of atoms. After the atom-field
interaction the field operators, as well as system operators $X$, are rotated by the unitary $U_t$. The
field is detected, giving rise to the observation $Y_t$. Finally, the quantum filter (implemented on a
classical signal processor) estimates atomic observables based on the field observations.



5.2. Measurements. Having described the system and its interaction with the
field, let us now turn to the observations that we can perform. Unlike in classical
models, where one observes the system directly (with the addition of some corrupting
noise), in quantum models an observation is generally performed in the field. From the
system's perspective, the interaction with the field looks like an (albeit noncommutative)
noisy driving force. Similarly, however, the field is perturbed by its interaction
with the atoms, and carries off information as it propagates away after the interaction.
By performing a measurement in the field, then, we can attempt to perform statistical
inference of the atomic observables. The entire setup is depicted in Fig. 5.1.

To calculate the perturbation of the field by the atoms we once again calculate
$U_t^*YU_t$, where now, however, $Y$ is a field observable. The field observable of interest
depends on the type of measurement we choose to perform. Without entering into the
details, we mention two types of measurement that are extremely common in quantum
optics: direct photodetection (photon counting), for which the observation at time $t$
is given by $Y_t^\Lambda = U_t^*\Lambda_tU_t$, and homodyne detection, for which $Y_t^W = U_t^*(A_t + A_t^*)U_t$
(more generally $Y_t^W = U_t^*(e^{-i\varphi}A_t + e^{i\varphi}A_t^*)U_t$). We refer to the quantum optics
literature for a detailed treatment of quantum optical measurements. Using the Itô rules we obtain

$$
dY_t^\Lambda = d\Lambda_t + j_t(L)\,dA_t^* + j_t(L^*)\,dA_t + j_t(L^*L)\,dt, \tag{5.4}
$$

$$
dY_t^W = j_t(L + L^*)\,dt + dA_t + dA_t^*. \tag{5.5}
$$

Intuitively, it would appear that $Y_t^\Lambda$ is like a Poisson process whose intensity is controlled
by $j_t(L^*L)$ (recall Example 4.4), whereas $Y_t^W$ looks like a noisy observation
of $j_t(L + L^*)$. One should be careful with this conclusion, however, as $j_t(L)$ need not
commute with $A_t$ or $A_t^*$, nor with itself at different times.

It is essential, however, that the observation process commutes with itself at
different times, and is hence equivalent to a classical stochastic process through the
spectral theorem. An observation process that does not obey this property cannot be
observed in a single realization of an experiment and is physically meaningless. Let
us show that the observation processes we have defined above do obey this property,
which is called the self-nondemolition property. Let $Z$ be any operator of the form
$I\otimes Z_{s]}\otimes I$ on $\mathsf{h}\otimes\mathcal{F}_{s]}\otimes\mathcal{F}_{[s}$ and let $t > s$. Then the Itô rules give directly

$$
U_t^*ZU_t = U_s^*ZU_s + \int_s^t U_\tau^*\,\mathcal{L}_{L,H}(Z)\,U_\tau\,d\tau
 + \int_s^t U_\tau^*[L^*,Z]U_\tau\,dA_\tau + \int_s^t U_\tau^*[Z,L]U_\tau\,dA_\tau^*.
$$

Now let $Z = A_s + A_s^*$ or $Z = \Lambda_s$. In both cases $\mathcal{L}_{L,H}(Z) = [L^*,Z] = [Z,L] = 0$ as $L$ and $H$
are system operators and $Z$ is a field observable. Hence $Y_s^W = U_t^*(A_s + A_s^*)U_t$ and
$Y_s^\Lambda = U_t^*\Lambda_sU_t$ for all $t > s$. It is now easily verified, using the unitarity of $U_t$ and the
fact that $A_s + A_s^*$ and $\Lambda_s$ are commutative processes, that
$[Y_t^W, Y_s^W] = [Y_t^\Lambda, Y_s^\Lambda] = 0$ for all $t, s$. We denote by $\mathscr{Y}_t^W$ and $\mathscr{Y}_t^\Lambda$ the
commutative von Neumann algebras generated by the observation processes $Y_s^W$ and $Y_s^\Lambda$,
$s\le t$, respectively. Do note, however, that $Y_t^W$ and $Y_t^\Lambda$ do not commute with each other;
in any experiment, we can choose to perform only one of these measurements. Once we have
made this choice, however, we can use the spectral theorem to represent the observations $Y_t$ as
a classical stochastic process $\iota(Y_t)$ on a probability space.

5.3. Statement of the filtering problem. Moving on to the next step in our
program, we now wish to use the information gained from the measurement process to
infer something about the system. To find a least mean square estimate of a system
observable $X\in\mathscr{B}$ at time $t$, given the observations up to this time, we must
calculate the conditional expectation

$$
\pi_t(X) = \mathbb{P}(j_t(X)\,|\,\mathscr{Y}_t) \tag{5.6}
$$

where $\mathscr{Y}_t = \mathrm{vN}(Y_s : 0\le s\le t)$. The remainder of this article is devoted to finding
a recursive equation for $\pi_t(X)$ (the filtering equation). Recall, however, that the
conditional expectation is only defined if $j_t(X)$ is in the commutant of $\mathscr{Y}_t$, the
interpretation being that statistical inference of an observable is only physically meaningful
if the conditional statistics could possibly be tested through a compatible experiment.
Through an entirely identical procedure to the one used to show the self-nondemolition
property, we can show that $j_t(X)$ is in the commutant of $\mathscr{Y}_t$ for any $X\in\mathscr{B}$. This is
known as the nondemolition property, which can be written as

$$
[j_t(X), Y_s] = 0 \qquad \forall\, s\le t,\ X\in\mathscr{B}. \tag{5.7}
$$

We note that we have now obtained a system-theoretic model of our system and
observations, defined on the quantum probability space $(\mathscr{B}\otimes\mathscr{W}, \mathbb{P} = \rho\otimes\phi)$ by

$$
\begin{aligned}
dj_t(X) &= j_t(\mathcal{L}_{L,H}(X))\,dt + j_t([L^*,X])\,dA_t + j_t([X,L])\,dA_t^*, &\quad (5.8)\\
dY_t &= j_t(L + L^*)\,dt + dA_t + dA_t^* &\quad (5.9)
\end{aligned}
$$

in the case of homodyne detection, or by Eq. (5.8) and

$$
dY_t = d\Lambda_t + j_t(L)\,dA_t^* + j_t(L^*)\,dA_t + j_t(L^*L)\,dt \tag{5.10}
$$

in the case of counting observations. These equations define a system-observation
model in direct analogy to such models used throughout classical nonlinear filtering
and stochastic control theory.

Remark 5.2. Unlike in a classical filtering scenario, we have not added any 
independent corrupting noise to the observations. Nonetheless, the filtering problem 
does not reduce to a problem with complete observations because the system is driven 
by noise that does not commute with the observations. Hence the problem of partial 
observations is intrinsic to quantum measurement theory. The quantum filtering prob- 
lem considered here is the simplest possible one; one could add additional corrupting 
noise as in the classical case, have the system interact with multiple fields (some of 
which are observed, others unobserved), etc. These are not essential complications, 
however, and filters for such models are obtained much in the same way. □ 
6. The reference probability method. The goal of this section is to derive
the quantum filtering equation, a recursive equation for $\pi_t(X)$, using a method that is
close to the classical reference probability method of Duncan [27], Mortensen [52], and
Zakai. We consider first the homodyne detection case, then the photon counting
case. In §7 we will rederive the filtering equation for the homodyne detection case
using martingale methods; the chief advantage of the reference probability method is
that it is somewhat simpler to apply. The following approach is based on [18].

6.1. Homodyne detection. Let us briefly recall the classical reference probability
procedure; for an introduction see e.g. [23]. In order to simplify the filtering
problem, one starts by introducing a new probability measure, using a Girsanov
transformation, under which the measurement record is a Wiener process. Then various
(elementary) properties of the conditional expectation allow the filtering problem to
be expressed, and solved, with respect to the new measure. We now apply this logic
to the quantum filtering problem. Note that we have already applied the method in
Example 3.19; the following is essentially a continuous time version of that example.

We consider the homodyne detection setup given by Eqs. (5.8) and (5.9). We
could try to find a new state under which $Y_t$ is a Wiener process; however, it will be
more convenient to work not in terms of $Y_t$ but in terms of $Z_t = A_t + A_t^*$, as it is very
easy to manipulate $Z_t$ using the methods developed above. Thus before we really start filtering,
let us transform the problem in terms of $Z_t$. Introduce the state $\mathbb{Q}^t$ defined by

$$
\mathbb{Q}^t(X) = \mathbb{P}(U_t^*XU_t), \tag{6.1}
$$

with $U_t$ as in §5, and we fix from now on $\mathbb{P} = \rho\otimes\phi$. Now recall (from an earlier
example) that $\mathbb{Q}(X) = \mathbb{P}(U^*XU)$ implies
$\mathbb{P}(U^*XU\,|\,U^*\mathscr{C}U) = U^*\,\mathbb{Q}(X\,|\,\mathscr{C})\,U$ (this is easily
checked using the definition of the conditional expectation). Thus we have

$$
\mathbb{P}(j_t(X)\,|\,\mathscr{Y}_t) = U_t^*\,\mathbb{Q}^t(X\,|\,\mathscr{C}_t)\,U_t, \qquad X\in\mathscr{B}, \tag{6.2}
$$

where $\mathscr{C}_t = \mathrm{vN}(Z_s : 0\le s\le t)$. Note that $\mathscr{Y}_t = U_t^*\mathscr{C}_tU_t$ follows from the fact that
$U_s^*Z_sU_s = U_t^*Z_sU_t$ for $t > s$, the property we used in §5.2 to prove self-nondemolition
of $Y_t$. The ease with which we will now be able to manipulate $\mathbb{Q}^t(X\,|\,\mathscr{C}_t)$ highlights
the usefulness of the transformation (6.2).

Our strategy will be as follows. We wish to calculate $\mathbb{Q}^t(X\,|\,\mathscr{C}_t)$; however, the
state $\mathbb{P}$ has the nice property that $Z_{s\le t}$, which generates $\mathscr{C}_t$, is a $\mathbb{P}$-Wiener process.
We want to use the Bayes formula, Lemma 3.18, in order to express $\mathbb{Q}^t(X\,|\,\mathscr{C}_t)$ in
terms of $\mathbb{P}$-conditional expectations. We run into a problem, however, as the "change
of measure" operator $U_t$ that relates $\mathbb{P}$ with $\mathbb{Q}^t$ does not satisfy the requirement of
Lemma 3.18 that$^{12}$ $U_t\in\mathscr{C}_t'$. To solve this problem, we will replace $U_t$ by a different
operator $V_t$ which is affiliated to $\mathscr{C}_t'$ but which still defines the same state in the sense
that $\mathbb{P}(U_t^*XU_t) = \mathbb{P}(V_t^*XV_t)$ for every $X$. The following technique, to our knowledge,
first appeared in [38]; it replaces Girsanov's theorem in the quantum context.

Lemma 6.1. Let $V_t$ be the solution of the QSDE

$$
dV_t = \Bigl\{L\,(dA_t^* + dA_t) - \tfrac12 L^*L\,dt - iH\,dt\Bigr\}\,V_t. \tag{6.3}
$$

Then $V_t$ is affiliated to $\mathscr{C}_t'$ and $\mathbb{Q}^t(X) = \mathbb{P}(V_t^*XV_t)$ for all $X\in\mathscr{B}\otimes\mathscr{W}$.

12 If this were the case then we could calculate $Y_t = U_t^*Z_tU_t = Z_tU_t^*U_t = Z_t$, i.e. the observations
would carry no information about the system and the filtering problem would be trivial.
A precise proof of this statement is not very insightful (it can be found in the literature).
However, it is not difficult to see why the statement should be true. Let us assume for
simplicity that the state $\rho$ on $\mathscr{B}$ is pure; we can always obtain a mixed state later by
taking convex combinations. Then $\mathbb{P}(X) = \langle\psi\otimes\Phi, X\,\psi\otimes\Phi\rangle$ for some vector $\psi\in\mathsf{h}$
(and $\Phi\in\mathcal{F}$ is the vacuum vector). To show that $\mathbb{P}(U_t^*XU_t) = \mathbb{P}(V_t^*XV_t)$, it is thus
sufficient to show that $U_t\,\psi\otimes\Phi = V_t\,\psi\otimes\Phi$. Now recall from §4 that any stochastic
integral with respect to $A_t$ vanishes when it acts on the vacuum vector; hence

$$
\begin{aligned}
U_t\,\psi\otimes\Phi
 &= \Bigl[I + \int_0^t LU_s\,dA_s^* - \int_0^t L^*U_s\,dA_s
    - \int_0^t \bigl(\tfrac12 L^*L + iH\bigr)U_s\,ds\Bigr]\psi\otimes\Phi \\
 &= \Bigl[I + \int_0^t LU_s\,dA_s^*
    - \int_0^t \bigl(\tfrac12 L^*L + iH\bigr)U_s\,ds\Bigr]\psi\otimes\Phi,
\end{aligned}
$$

and similarly we obtain for $V_t$ acting on the vacuum

$$
V_t\,\psi\otimes\Phi
 = \Bigl[I + \int_0^t LV_s\,dA_s^*
    - \int_0^t \bigl(\tfrac12 L^*L + iH\bigr)V_s\,ds\Bigr]\psi\otimes\Phi.
$$

But as these expressions are the same, they should have the same solution
$U_t\,\psi\otimes\Phi = V_t\,\psi\otimes\Phi$. In principle, we could change the integrand of the $A_t$-integral
arbitrarily without affecting how the QSDE acts on the vacuum; in Lemma 6.1 we exploit this
fact to modify $U_t$ precisely so that it is in the commutant of $\mathscr{C}_t$; indeed, Eq. (6.3) is
driven only by the noise $Z_t = A_t + A_t^*$ and its coefficients are in $\mathscr{B}\subset\mathscr{C}_t'$.

We are now ready to apply the Bayes formula, Lemma 3.18. Together with Lemma
6.1 and Eq. (6.2), we immediately obtain the following result.

Theorem 6.2 (Noncommutative Kallianpur-Striebel). Define for any system
operator $X\in\mathscr{B}$ the unnormalized conditional expectation

$$
\sigma_t(X) = U_t^*\,\mathbb{P}(V_t^*XV_t\,|\,\mathscr{C}_t)\,U_t \in \mathscr{Y}_t. \tag{6.4}
$$

Then the conditional expectation (5.6) is given by

$$
\pi_t(X) = \frac{\sigma_t(X)}{\sigma_t(I)}, \qquad \forall\, X\in\mathscr{B}. \tag{6.5}
$$

We now obtain an explicit expression for $\sigma_t(X)$.

Theorem 6.3 (Unnormalized quantum filtering equation). The unnormalized
conditional expectation $\sigma_t(X)$ satisfies the following linear QSDE:

$$
d\sigma_t(X) = \sigma_t(\mathcal{L}_{L,H}(X))\,dt + \sigma_t(L^*X + XL)\,dY_t. \tag{6.6}
$$



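To obtain (6.6) we will need to take conditional expectations of quantum Itô
integrals. Let us briefly show how to do this. First, we claim that if $K_t$ is an adapted
process with $K_s$ affiliated to $\mathscr{C}_s'$, then $\mathbb{P}(K_s\,|\,\mathscr{C}_t) = \mathbb{P}(K_s\,|\,\mathscr{C}_s)$ for $s\le t$. This follows
from the fact that $\mathscr{C}_t = \mathscr{C}_s\otimes\mathscr{C}_{[s,t]}$ and that $K_s$ is independent from $\mathscr{C}_{[s,t]}$ by
adaptedness. Second, conditional expectations and integrals can be exchanged as follows:

$$
\mathbb{P}\Bigl(\int_0^t K_s\,ds\,\Big|\,\mathscr{C}_t\Bigr) = \int_0^t \mathbb{P}(K_s\,|\,\mathscr{C}_s)\,ds,
\qquad
\mathbb{P}\Bigl(\int_0^t K_s\,dZ_s\,\Big|\,\mathscr{C}_t\Bigr) = \int_0^t \mathbb{P}(K_s\,|\,\mathscr{C}_s)\,dZ_s.
$$

These properties are immediate if $K_t$ is a simple process, and a proof of the general
case is not difficult.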
Proof. Using the quantum Itô rules we have

$$
V_t^*XV_t = X + \int_0^t V_s^*\,\mathcal{L}_{L,H}(X)\,V_s\,ds + \int_0^t V_s^*(L^*X + XL)V_s\,d(A_s + A_s^*).
$$

We next take conditional expectations of the terms in this expression; we obtain

$$
\mathbb{P}(V_t^*XV_t\,|\,\mathscr{C}_t) = \mathbb{P}(X)
 + \int_0^t \mathbb{P}\bigl(V_s^*\,\mathcal{L}_{L,H}(X)\,V_s\,|\,\mathscr{C}_s\bigr)\,ds
 + \int_0^t \mathbb{P}\bigl(V_s^*(L^*X + XL)V_s\,|\,\mathscr{C}_s\bigr)\,d(A_s + A_s^*).
$$

Another application of the quantum Itô rules now yields (6.6). □

By applying the Itô rules to the noncommutative Kallianpur-Striebel formula
(6.5), we obtain an expression for the normalized conditional state

$$
d\pi_t(X) = \pi_t(\mathcal{L}_{L,H}(X))\,dt
 + \bigl(\pi_t(L^*X + XL) - \pi_t(L^* + L)\,\pi_t(X)\bigr)\bigl(dY_t - \pi_t(L^* + L)\,dt\bigr). \tag{6.7}
$$

This (normalized) quantum filtering equation is a quantum analog of the classical
Kushner-Stratonovich equation of nonlinear filtering. Note that this is a classical
stochastic differential equation by the spectral theorem: it is a recursive equation that
is only driven by the (commutative) observations $Y_t$. Hence it can be implemented
on a classical (digital) signal processor, as depicted in Fig. 5.1.
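The step from the unnormalized to the normalized equation can be spelled out as a short Itô computation. The sketch below assumes that the classical Itô rule may be applied to the commutative processes $\sigma_t(X)$ and $\sigma_t(I)$, and uses $(dY_t)^2 = dt$, which follows from the form of $dY_t$ in the homodyne case together with the Itô table.

```latex
\begin{aligned}
d\sigma_t(I) &= \sigma_t(L^* + L)\,dY_t, \\
d\bigl(\sigma_t(I)^{-1}\bigr)
  &= -\frac{\sigma_t(L^* + L)}{\sigma_t(I)^2}\,dY_t
     + \frac{\sigma_t(L^* + L)^2}{\sigma_t(I)^3}\,dt, \\
d\pi_t(X)
  &= \frac{d\sigma_t(X)}{\sigma_t(I)} + \sigma_t(X)\,d\bigl(\sigma_t(I)^{-1}\bigr)
     + d\sigma_t(X)\,d\bigl(\sigma_t(I)^{-1}\bigr) \\
  &= \pi_t(\mathcal{L}_{L,H}(X))\,dt + \pi_t(L^*X + XL)\,dY_t
     - \pi_t(X)\,\pi_t(L^* + L)\,dY_t \\
  &\quad + \pi_t(X)\,\pi_t(L^* + L)^2\,dt
     - \pi_t(L^*X + XL)\,\pi_t(L^* + L)\,dt .
\end{aligned}
```

Collecting the $dY_t$ and $dt$ terms gives the factored form with the increment $dY_t - \pi_t(L^* + L)\,dt$, in agreement with the normalized filtering equation stated above.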

Remark 6.4. Eq. (6.7) is expressed in terms of the conditional state $\pi_t(X)$,
where $X\in\mathscr{B}$. Now recall from §2 that any state on a finite-dimensional Hilbert
space can be expressed as $\mathrm{Tr}[\rho X]$ for some density matrix $\rho$. Similarly, if $\mathsf{h}$ (and
hence $\mathscr{B}$) is finite-dimensional, then we can always write $\pi_t(X) = \mathrm{Tr}[\rho_tX]$ where $\rho_t$,
the conditional density matrix, is a (random) density matrix that is a function of the
observations up to time $t$. From Eq. (6.7) we obtain explicitly

$$
d\rho_t = -i[H,\rho_t]\,dt + \bigl(L\rho_tL^* - \tfrac12 L^*L\rho_t - \tfrac12\rho_tL^*L\bigr)\,dt
 + \bigl(L\rho_t + \rho_tL^* - \mathrm{Tr}[(L + L^*)\rho_t]\,\rho_t\bigr)\,dW_t
$$

where $dW_t = dY_t - \mathrm{Tr}[(L + L^*)\rho_t]\,dt$. In §7 we will see that $W_t$ is a Wiener process.
It is this representation that is usually found in the physics literature. □
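To indicate how the density-matrix form of the filter might be run on a classical processor, here is a minimal Euler-Maruyama sketch in Python. Several things are assumptions of the example rather than statements from the text: the two-level model with $L = \sigma_-$ and $H = 0$, the step size, the function names `sme_step` and `simulate`, and the use of simulated Gaussian innovations $dW_t \sim N(0, dt)$ in place of a measured record (in an experiment one would instead set $dW_t = dY_t - \mathrm{Tr}[(L+L^*)\rho_t]\,dt$).

```python
import numpy as np

def sme_step(rho, L, H, dW, dt):
    """One Euler-Maruyama step of the conditional density matrix equation."""
    Ld = L.conj().T
    drift = (-1j * (H @ rho - rho @ H) + L @ rho @ Ld
             - 0.5 * (Ld @ L @ rho + rho @ Ld @ L))
    gain = L @ rho + rho @ Ld - np.trace((L + Ld) @ rho).real * rho
    return rho + drift * dt + gain * dW

def simulate(rho0, L, H, T=1.0, n=2000, seed=0):
    """Propagate rho0 over [0, T] using simulated innovation increments."""
    rng = np.random.default_rng(seed)
    dt = T / n
    rho = rho0.astype(complex)
    for _ in range(n):
        dW = rng.normal(0.0, np.sqrt(dt))   # stands in for dY - Tr[(L+L*)rho] dt
        rho = sme_step(rho, L, H, dW, dt)
    return rho

L = np.array([[0, 1], [0, 0]], dtype=complex)   # lowering operator (illustrative)
H = np.zeros((2, 2), dtype=complex)
rho0 = 0.5 * np.ones((2, 2))                    # equal superposition of both levels
rho_T = simulate(rho0, L, H)
```

The Euler scheme preserves the trace and Hermiticity of $\rho_t$ up to rounding, but it does not guarantee positivity; a production implementation would typically use a scheme designed for that purpose.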

6.2. Photon counting measurements. We now consider the photon counting
setup given by Eqs. (5.8) and (5.10). We would like to follow the same procedure as
for homodyne detection. The following lemma, which replaces Lemma 6.1, suggests
how to proceed. The proof is identical to that of Lemma 6.1.

Lemma 6.5. Let $U_t'$ be the solution of the QSDE

$$
dU_t' = \Bigl\{L'\,dA_t^* - L'^*\,dA_t - \tfrac12 L'^*L'\,dt - iH'\,dt\Bigr\}\,U_t'
$$

and let $V_t'$ be the solution of

$$
dV_t' = \Bigl\{L'\,(d\Lambda_t + dA_t^* + dA_t + dt) - \tfrac12 L'^*L'\,dt - L'\,dt - iH'\,dt\Bigr\}\,V_t'.
$$

Then $V_t'$ is affiliated to $\mathrm{vN}(\Lambda_s + A_s^* + A_s + s : s\le t)'$ and
$\mathbb{P}(U_t'^*XU_t') = \mathbb{P}(V_t'^*XV_t')$.

Define $Z_t = \Lambda_t + A_t^* + A_t + t$ and $\mathscr{C}_t = \mathrm{vN}(Z_s : 0\le s\le t)$. Lemma 6.5 directly
provides us with a nondemolition change of measure, provided that we rotate our
problem so that $\mathscr{Y}_t = U_t'^*\mathscr{C}_tU_t'$ using a suitable unitary operator $U_t'$. Then, defining
$\sigma_t(X) = U_t'^*\,\mathbb{P}(V_t'^*XV_t'\,|\,\mathscr{C}_t)\,U_t'$, the Kallianpur-Striebel formula holds for $\sigma_t(X)$.
Define $R_t$ as the solution of the QSDE

$$
dR_t = \Bigl\{dA_t - dA_t^* - \tfrac12\,dt\Bigr\}\,R_t.
$$

Recall Example 4.4; evidently $R_t$ is a Weyl operator, and in particular $\Lambda_t = R_t^*Z_tR_t$.
But recall that $Y_t = U_t^*\Lambda_tU_t = U_t^*R_t^*Z_tR_tU_t$; thus $U_t' = R_tU_t$ is our rotation of choice.
Using the quantum Itô rules we obtain

$$
dU_t' = \Bigl\{(L - I)\,dA_t^* - (L^* - I)\,dA_t - \tfrac12\bigl(L^*L + I - 2L + 2iH\bigr)\,dt\Bigr\}\,U_t',
$$

which corresponds to the nondemolition change of measure

$$
dV_t' = \Bigl\{(L - I)\,dZ_t - \tfrac12\bigl(L^*L - I + 2iH\bigr)\,dt\Bigr\}\,V_t'.
$$

For $X\in\mathscr{B}$, we obtain using the quantum Itô rules

$$
d(V_t'^*XV_t') = V_t'^*\,\mathcal{L}_{L,H}(X)\,V_t'\,dt + V_t'^*(L^*XL - X)V_t'\,(dZ_t - dt).
$$

Finally we obtain using the definition of $\sigma_t$ and the quantum Itô rules

$$
d\sigma_t(X) = \sigma_t(\mathcal{L}_{L,H}(X))\,dt + \bigl(\sigma_t(L^*XL) - \sigma_t(X)\bigr)\,(dY_t - dt),
$$

which is the unnormalized quantum filtering equation for counting observations.

Using the Kallianpur-Striebel formula $\pi_t(X) = \sigma_t(X)/\sigma_t(I)$ we can now obtain
an expression for the normalized conditional state

$$
d\pi_t(X) = \pi_t(\mathcal{L}_{L,H}(X))\,dt
 + \Bigl(\frac{\pi_t(L^*XL)}{\pi_t(L^*L)} - \pi_t(X)\Bigr)\bigl(dY_t - \pi_t(L^*L)\,dt\bigr),
$$

which is the normalized quantum filtering equation for photon counting.
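A density-matrix version of the counting filter can be sketched along the same lines. In the Python example below, the conditional state jumps to $L\rho L^*/\mathrm{Tr}[L^*L\rho]$ when a count is registered (keeping only the leading-order term of the update) and otherwise follows the $dY_t = 0$ part of the equation. The model ($L = \sigma_-$, $H = 0$), the function names, and in particular the rule that counts are sampled with probability $\pi_t(L^*L)\,dt$ per step are assumptions made for illustration; the last one reflects the intuitive reading of $Y_t$ as a counting process with intensity controlled by $L^*L$.

```python
import numpy as np

def counting_filter_step(rho, L, H, counted, dt):
    """One Euler step of the normalized counting filter for a density matrix."""
    Ld = L.conj().T
    rate = np.trace(Ld @ L @ rho).real                 # pi_t(L*L)
    if counted:
        # dY_t = 1: to leading order the state jumps to L rho L* / pi_t(L*L).
        # This branch is only meaningful when rate > 0.
        return (L @ rho @ Ld) / rate
    lindblad = (-1j * (H @ rho - rho @ H) + L @ rho @ Ld
                - 0.5 * (Ld @ L @ rho + rho @ Ld @ L))
    # dY_t = 0: the innovation term reduces to -(L rho L* - rate * rho) dt.
    return rho + (lindblad - L @ rho @ Ld + rate * rho) * dt

def simulate_counts(rho0, L, H, T=5.0, n=5000, seed=1):
    """Run the filter on a simulated count record; returns final state and count."""
    rng = np.random.default_rng(seed)
    dt = T / n
    rho, counts = rho0.astype(complex), 0
    Ld = L.conj().T
    for _ in range(n):
        rate = np.trace(Ld @ L @ rho).real
        counted = rng.random() < rate * dt             # assumed count statistics
        counts += int(counted)
        rho = counting_filter_step(rho, L, H, counted, dt)
    return rho, counts

L = np.array([[0, 1], [0, 0]], dtype=complex)          # lowering operator (illustrative)
H = np.zeros((2, 2), dtype=complex)
rho_excited = np.diag([0.0, 1.0])                      # atom starts in the excited state
rho_T, n_counts = simulate_counts(rho_excited, L, H)
```

For this model an atom that starts in the excited state can emit at most one photon: after the first count the conditional state is the ground state, the intensity $\pi_t(L^*L)$ vanishes, and no further counts occur.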

7. The innovations method. In this section we rederive the filtering equation 
for homodyne detection, Eq. (6.7), using martingale methods that are analogous to
the classical case E| ■ We follow the classical treatment as in , |2S1 chapter 
18], [SHI chapter 7]. Martingale methods have enjoyed wide and successful application 
in classical stochastic theory. The procedure is less straightforward than the reference 
probability method, however, and some familiarity with classical filtering theory would 
be helpful (see e.g. |2U for an excellent introduction). 

Let $\xi_t, \beta_t, \lambda_t, \mu_t$ be adapted processes affiliated to $\mathscr{Y}_t'$, where

$$\xi_t = \xi_0 + \int_0^t \beta_s\,ds + m_t = \xi_0 + \int_0^t \beta_s\,ds + \int_0^t (\lambda_s\,dA_s + \mu_s\,dA_s^*). \tag{7.1}$$

The measurement process $Y_t$ is given by (5.9), and in what follows we write $h_t =
j_t(L + L^*)$ and $Z_t = A_t + A_t^*$. Note that the conditional expectation $\hat\xi_t = P(\xi_t|\mathscr{Y}_t)$ is
well defined, and similarly for the coefficients $\beta_t$, $\lambda_t$ and $\mu_t$.

The main filtering result for a process of the form (7.1) is the following.

Theorem 7.1 (Noncommutative Fujisaki-Kallianpur-Kunita). Under the above
assumptions, the filtered process $\hat\xi_t$ satisfies the QSDE

$$d\hat\xi_t = \hat\beta_t\,dt + (\hat\lambda_t + \widehat{\xi_t h_t} - \hat\xi_t \hat h_t)\,dW_t \tag{7.2}$$

where $\hat X_t = P(X_t|\mathscr{Y}_t)$ for any $X_t$ affiliated to $\mathscr{Y}_t'$, and $dW_t = dY_t - \hat h_t\,dt$ defines the
$\mathscr{Y}_t$-Wiener process (with respect to $P$) $W_t$, called the innovations process.

The filtering expression (7.2) is formally identical to the classical case [29, Theorem
18.11], [63, Proposition 3.2]. Before we prove Theorem 7.1 we will show how to
obtain the quantum filtering equation (6.7) using this result.

Corollary 7.2. The conditional state $\pi_t(X)$ is given by Eq. (6.7).

Proof. We set $\lambda_t = -j_t([X,L^*])$, $\mu_t = j_t([X,L])$, $\beta_t = j_t(\mathcal{L}_{L,H}(X))$, and $\xi_t =
j_t(X)$. Then $\widehat{\xi_t h_t} = \pi_t(X(L + L^*))$, $\hat\xi_t \hat h_t = \pi_t(X)\,\pi_t(L + L^*)$, $\hat\lambda_t = -\pi_t([X,L^*])$, and
$\hat\beta_t = \pi_t(\mathcal{L}_{L,H}(X))$. Hence using Eq. (7.2), Eq. (6.7) follows. □
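
Written out, the substitution in the proof of the corollary runs as follows. This is only a sketch of the algebra; the final line is what the general filtering formula produces for these coefficients, and it should coincide with the homodyne filtering equation referred to as Eq. (6.7).

```latex
% Gain coefficient of the innovations term:
\begin{align*}
\hat\lambda_t + \widehat{\xi_t h_t} - \hat\xi_t\hat h_t
 &= -\pi_t(XL^* - L^*X) + \pi_t(XL + XL^*) - \pi_t(X)\,\pi_t(L + L^*) \\
 &= \pi_t(XL + L^*X) - \pi_t(X)\,\pi_t(L + L^*).
\end{align*}

% Innovations increment, using \hat h_t = \pi_t(L + L^*):
\begin{align*}
dW_t &= dY_t - \pi_t(L + L^*)\,dt .
\end{align*}

% Resulting filtering equation:
\begin{align*}
d\pi_t(X) &= \pi_t(\mathcal{L}_{L,H}(X))\,dt
 + \bigl(\pi_t(XL + L^*X) - \pi_t(X)\,\pi_t(L + L^*)\bigr)
   \bigl(dY_t - \pi_t(L + L^*)\,dt\bigr).
\end{align*}
```

Note that the coefficient $\mu_t$ does not appear in the result: only $\lambda_t$ contributes an Ito correction against $dZ_t = dA_t + dA_t^*$.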

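Proof. (Theorem 7.1). Step 1. We first show that the process

$$M_t = \hat\xi_t - \hat\xi_0 - \int_0^t \hat\beta_s\,ds$$

is a $\mathscr{Y}_t$-martingale, i.e. $P(M_t|\mathscr{Y}_s) = M_s$ for all $s \le t$. This property is equivalent to
$P((M_t - M_s)K) = 0$ for all $K \in \mathscr{Y}_s$, or equivalently

$$P\Big[\Big(\hat\xi_t - \hat\xi_s - \int_s^t \hat\beta_r\,dr\Big)K\Big] = P\Big[\Big(\xi_t - \xi_s - \int_s^t \beta_r\,dr\Big)K\Big] = P[(m_t - m_s)K] = 0$$

for all $K \in \mathscr{Y}_s$, where we have used Def. 3.13 in the first step. But as $K \in \mathscr{Y}_s \subset \mathscr{B}\otimes\mathscr{W}_{s]}$

$$P[(m_t - m_s)K] = P\Big[K\int_s^t (\lambda_r\,dA_r + \mu_r\,dA_r^*)\Big] = P\Big[\int_s^t (K\lambda_r\,dA_r + K\mu_r\,dA_r^*)\Big] = 0,$$

where we have used that the vacuum expectation of quantum Ito integrals vanishes.
Thus we have demonstrated that $M_t$ is a $\mathscr{Y}_t$-martingale.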

Step 2. We now show that $W_t$ is a Wiener process under $P$. We begin by verifying
that the innovations process

$$W_t = Y_t - \int_0^t \hat h_s\,ds \tag{7.3}$$

is a $\mathscr{Y}_t$-martingale. We need to show that $P[(W_t - W_s)K] = 0$ for any $s \le t$ and
$K \in \mathscr{Y}_s$. This is equivalent to

$$P\Big[\Big(Y_t - Y_s - \int_s^t \hat h_r\,dr\Big)K\Big] = P\Big[\Big(Y_t - Y_s - \int_s^t h_r\,dr\Big)K\Big] = 0$$

for all $K \in \mathscr{Y}_s$, where the second expression follows from the definition of the conditional
expectation. But from (5.9) we obtain

$$P\Big[\Big(Y_t - Y_s - \int_s^t h_r\,dr\Big)K\Big] = P[(Z_t - Z_s)K] = 0$$

as $K \in \mathscr{Y}_s \subset \mathscr{B}\otimes\mathscr{W}_{s]}$, $(Z_t - Z_s) \in \mathscr{W}_{[s,t]}$, and hence $P[(Z_t - Z_s)K] = P(K)\,P(Z_t - Z_s) =
0$. Thus $W_t$ is a $\mathscr{Y}_t$-martingale.

From (7.3) we read off the Ito rule $dW_t^2 = dt$; classically, a process that obeys this
property and is a martingale must be a Wiener process by Lévy's Theorem (e.g. [29,
Lemma 18.7]). But we can simply apply the classical result, as $W_t$ is a commutative
process (note that $\hat h_s \in \mathscr{Y}_t$ for $s \le t$ by construction) and is hence equivalent to the
corresponding classical process obtained through the spectral theorem.
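
The Ito rule for the innovations can be read off in two lines. The sketch assumes the observation equation in the form $dY_t = h_t\,dt + dA_t + dA_t^*$ and the vacuum Ito table in which $dA_t\,dA_t^* = dt$ is the only nonvanishing product of the increments $dt$, $dA_t$, $dA_t^*$.

```latex
\begin{align*}
dW_t &= dY_t - \hat h_t\,dt = (h_t - \hat h_t)\,dt + dA_t + dA_t^*, \\
dW_t^2 &= (dA_t + dA_t^*)^2
        = dA_t\,dA_t + dA_t\,dA_t^* + dA_t^*\,dA_t + dA_t^*\,dA_t^*
        = dt .
\end{align*}
```

The drift term $(h_t - \hat h_t)\,dt$ does not contribute, since every product involving $dt$ vanishes.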
Now that we have shown that $W_t$ is a Wiener process, we can try to represent the
martingale $M_t$ as a stochastic integral with respect to $W_t$. As usual in filtering theory
the ordinary martingale representation theorem does not suffice for this purpose, but
the representation theorem of Fujisaki-Kallianpur-Kunita (e.g. [48, Theorem 5.20])
allows us to conclude nonetheless that

$$M_t = \int_0^t \gamma_s\,dW_s \quad\Longrightarrow\quad \hat\xi_t = \hat\xi_0 + \int_0^t \hat\beta_s\,ds + \int_0^t \gamma_s\,dW_s \tag{7.4}$$

for some $\mathscr{Y}_t$-adapted process $\gamma_t$.

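Step 3. We next obtain a first expression for $\widehat{\xi_t Y_t}$:

$$\widehat{\xi_t Y_t} = \int_0^t \big[\widehat{\beta_s Y_s} + \widehat{\xi_s h_s} + \hat\lambda_s\big]\,ds + M_1(t), \tag{7.5}$$

where $M_1(t)$ is a $\mathscr{Y}_t$-martingale. As before, it suffices to show that

$$P((M_1(t) - M_1(s))K) = P\Big[\Big(\xi_t Y_t - \xi_s Y_s - \int_s^t [\beta_r Y_r + \xi_r h_r + \lambda_r]\,dr\Big)K\Big] = 0$$

for all $K \in \mathscr{Y}_s$, where we have used the definition of the conditional expectation. But

$$\begin{aligned}
d(\xi_t Y_t) &= (d\xi_t)\,Y_t + \xi_t\,dY_t + d\xi_t\,dY_t \\
 &= (\beta_t\,dt + dm_t)\,Y_t + \xi_t\,(h_t\,dt + dZ_t) + dm_t\,dZ_t \\
 &= (\beta_t Y_t + \xi_t h_t + \lambda_t)\,dt + (Y_t\lambda_t + \xi_t)\,dA_t + (Y_t\mu_t + \xi_t)\,dA_t^*.
\end{aligned}$$

Hence exactly as before, it follows that $M_1(t)$ is a $\mathscr{Y}_t$-martingale.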
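
The only nontrivial term in the expansion of $d(\xi_t Y_t)$ is the Ito correction. As a brief sketch, assuming $dm_t = \lambda_t\,dA_t + \mu_t\,dA_t^*$, $dZ_t = dA_t + dA_t^*$ and the vacuum Ito table with $dA_t\,dA_t^* = dt$ as the only nonzero product:

```latex
\begin{align*}
dm_t\,dZ_t
 &= (\lambda_t\,dA_t + \mu_t\,dA_t^*)(dA_t + dA_t^*) \\
 &= \lambda_t\,dA_t\,dA_t^* + \lambda_t\,dA_t\,dA_t
  + \mu_t\,dA_t^*\,dA_t + \mu_t\,dA_t^*\,dA_t^* \\
 &= \lambda_t\,dt .
\end{align*}
```

This is why $\hat\lambda_t$, and not $\hat\mu_t$, enters the drift of the first expression for $\widehat{\xi_t Y_t}$.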
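Step 4. Next, we derive a second expression for $\widehat{\xi_t Y_t}$:

$$\widehat{\xi_t Y_t} = \int_0^t \big[\hat\beta_s Y_s + \hat\xi_s \hat h_s + \gamma_s\big]\,ds + M_2(t), \tag{7.6}$$

where $M_2(t)$ is a $\mathscr{Y}_t$-martingale. To show this, note that $\widehat{\xi_t Y_t} = \hat\xi_t Y_t$. By Ito's rules,

$$\begin{aligned}
d(\hat\xi_t Y_t) &= (d\hat\xi_t)\,Y_t + \hat\xi_t\,dY_t + d\hat\xi_t\,dY_t \\
 &= (\hat\beta_t\,dt + \gamma_t\,dW_t)\,Y_t + \hat\xi_t\,(\hat h_t\,dt + dW_t) + \gamma_t\,dW_t\,dW_t \\
 &= (\hat\beta_t Y_t + \hat\xi_t \hat h_t + \gamma_t)\,dt + (\gamma_t Y_t + \hat\xi_t)\,dW_t,
\end{aligned}$$

which establishes (7.6).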

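Step 5. We can now identify $\gamma_t$. From (7.5) and (7.6) we have two representations
for $\widehat{\xi_t Y_t}$. By uniqueness, it follows that the finite variation terms are equal, viz.

$$\widehat{\beta_s Y_s} + \widehat{\xi_s h_s} + \hat\lambda_s = \hat\beta_s Y_s + \hat\xi_s \hat h_s + \gamma_s.$$

Since $\widehat{\beta_s Y_s} = \hat\beta_s Y_s$, we obtain $\gamma_s = \widehat{\xi_s h_s} + \hat\lambda_s - \hat\xi_s \hat h_s$ as required. □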